PCRE — Regular Expressions (Perl-Compatible)

PCRE Patterns


$ top -n1 -b | head -n 1
top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07

Character classes

$message = "top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07";
$pattern = "/\d\d:\d\d:\d\d/";
preg_match($pattern, $message, $matches);
var_dump($matches); //=> array(1) { [0]=>string(8) "11:41:04" }

Repetition

$message = "top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07";
$pattern = "/\d{2}:\d{2}:\d{2}/";
preg_match($pattern, $message, $matches);
var_dump($matches); //=> array(1) { [0]=>string(8) "11:41:04" }
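
The other common quantifiers are `+` (one or more) and `?` (optional). The sketch below applies them to the same sample line; it assumes the uptime is reported in minutes, as in this sample, and the variable names `$uptimeMinutes` and `$users` are illustrative.

```php
<?php
$message = "top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07";
// \d+  one or more digits
// \s+  one or more whitespace characters
// s?   an optional "s", so both "1 user" and "2 users" match
$pattern = '/up (\d+) min,\s+(\d+) users?/';
if (preg_match($pattern, $message, $matches)) {
    $uptimeMinutes = (int) $matches[1];
    $users = (int) $matches[2];
    echo "up {$uptimeMinutes} min, {$users} user(s)\n";
}
```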

Capturing group

$message = "top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07";
$pattern = "/load average: (.+), (.+), (.+)/";
preg_match($pattern, $message, $matches);
var_dump($matches);
//=>
// array(4) {
//   [0]=>string(30) "load average: 0.00, 0.01, 0.07"
//   [1]=>string(4) "0.00"
//   [2]=>string(4) "0.01"
//   [3]=>string(4) "0.07"
// }
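
Numeric indexes become hard to follow as a pattern grows. A possible alternative is named groups, sketched below on the same sample line; the group names `one`, `five`, and `fifteen` are illustrative choices, and `[\d.]+` replaces `.+` so that each group accepts only digits and dots.

```php
<?php
$message = "top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07";
// (?<name>...) is a named capturing group.
$pattern = '/load average: (?<one>[\d.]+), (?<five>[\d.]+), (?<fifteen>[\d.]+)/';
$load = [];
if (preg_match($pattern, $message, $matches)) {
    // Each named group is available under its name and under its numeric index.
    $load = [
        "one" => (float) $matches["one"],
        "five" => (float) $matches["five"],
        "fifteen" => (float) $matches["fifteen"],
    ];
}
var_dump($load);
```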

Alternation

macOS:

$ top -l3 -n1 | head -n 4 | tail -n 1
CPU usage: 14.79% user, 20.91% sys, 64.28% idle

Linux:

$ top -n1 -b | head -n 3 | tail -n 1
%Cpu(s):  1.2 us,  0.6 sy,  0.1 ni, 98.0 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st

CPU User:

$message = "%Cpu(s):  1.2 us,  0.6 sy,  0.1 ni, 98.0 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st";
$pattern = "/Cpu\(s\):  (.+) us|CPU usage: (.+)% user/";
preg_match($pattern, $message, $matches);
var_dump($matches);
//=>
// array(2) {
//  [0]=>string(15) "Cpu(s):  1.2 us"
//  [1]=>string(3) "1.2"
// }
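
With the pattern above, the group that receives the value depends on which alternative matched: for the macOS line, `$matches[1]` is an empty string and the value lands in `$matches[2]`. One way around this, sketched here under the assumption that only the user percentage is needed, is a branch-reset group `(?|...)`, which restarts group numbering in every alternative.

```php
<?php
// (?|...) is a branch-reset group: each alternative numbers its groups from 1,
// so the captured value is always found in $matches[1].
$pattern = '/(?|Cpu\(s\):\s+([\d.]+) us|CPU usage: ([\d.]+)% user)/';

$linux = "%Cpu(s):  1.2 us,  0.6 sy,  0.1 ni, 98.0 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st";
$mac = "CPU usage: 14.79% user, 20.91% sys, 64.28% idle";

preg_match($pattern, $linux, $matches);
$linuxUser = $matches[1];

preg_match($pattern, $mac, $matches);
$macUser = $matches[1];

echo "linux: {$linuxUser}, mac: {$macUser}\n";
```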

Functions


preg_match()

Top

$ top -n1 -b
top - 11:41:04 up 21 min,  1 user,  load average: 0.00, 0.01, 0.07
Tasks:  86 total,   1 running,  85 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.7 us,  2.2 sy,  0.5 ni, 92.0 id,  0.3 wa,  0.0 hi,  0.3 si,
KiB Mem:    501692 total,   475348 used,    26344 free,    13032 buffers
KiB Swap:        0 total,        0 used,        0 free.   281204 cached

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+
    1 root      20   0   33632   2940   1468 S  0.0  0.6   0:01.87
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.01
...
17836 www-data  20   0  287448   6708   2424 S  0.0  1.3   0:00.02
17914 vagrant   20   0   23528   1396   1048 R  0.0  0.3   0:00.00

Regular Expression (Tool):

/Tasks:\s+(\d+) total,\s+(\d+) running,\s+(\d+) sleeping,\s+(\d+) stopped,\s+(\d+) zombie/

codes/top.php:

<pre>
<?php
$top = shell_exec("top -n1 -b");
$regex = "/Tasks:\s+(\d+) total,\s+(\d+) running,\s+(\d+) sleeping,\s+(\d+) stopped,\s+(\d+) zombie/";
preg_match($regex, $top, $matches);
var_dump($matches);
?>

http://localhost:8080/pcre/codes/top.php:

$ curl -i http://localhost:8080/pcre/codes/top.php
HTTP/1.1 200 OK
Server: Apache/2.4.38 (Debian)
X-Powered-By: PHP/7.3.12
Vary: Accept-Encoding
Content-Length: 233
Content-Type: text/html; charset=UTF-8

<pre>
array(6) {
  [0]=>string(68) "Tasks:  86 total,   1 running,  85 sleeping,   0 stopped,   0 zombie"
  [1]=>string(2) "86"
  [2]=>string(1) "1"
  [3]=>string(2) "85"
  [4]=>string(1) "0"
  [5]=>string(1) "0"
}
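
The script above reads `$matches` without checking whether the pattern matched. A minimal sketch of a more defensive variant follows; `parse_tasks()` is a hypothetical helper, not part of the original code, and it works on a hard-coded sample line instead of calling `top`.

```php
<?php
// Hypothetical helper: returns the task counters, or null when the line is absent.
function parse_tasks(string $top): ?array
{
    $regex = '/Tasks:\s+(?<total>\d+) total,\s+(?<running>\d+) running,'
           . '\s+(?<sleeping>\d+) sleeping,\s+(?<stopped>\d+) stopped,'
           . '\s+(?<zombie>\d+) zombie/';
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($regex, $top, $matches) !== 1) {
        return null;
    }
    // Keep only the named keys and cast the digit strings to integers.
    $named = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
    return array_map('intval', $named);
}

$sample = "Tasks:  86 total,   1 running,  85 sleeping,   0 stopped,   0 zombie";
$tasks = parse_tasks($sample);
var_dump($tasks);
```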

Crontab

codes/crontab.txt:

30 08 10 06 * /scripts/full-backup
00 11,16 * * * /scripts/incremental-backup
00 09-18 * * * /scripts/check-db-status
00 09-18 * * 1-5 /scripts/check-db-status
*/10 * * * * /scripts/monitor.sh
@yearly /scripts/annual-maintenance
@monthly /scripts/tape-backup
@daily /scripts/cleanup-logs
@reboot /script/start-service-x

Regular Expression (Tool):

(@yearly|@monthly|@daily|@reboot) (.+)|([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) (.+)

codes/crontab.php:

<?php
function crontab_parse($crontab)
{
  $crontabArray = [];
  // Groups 1-2 belong to the @keyword branch, groups 3-8 to the five-field branch.
  $regex = '/(@yearly|@monthly|@daily|@reboot) (.+)|([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) (.+)/';
  foreach (explode("\n", $crontab) as $cronLine) {
    // Skip blank or malformed lines instead of reading an empty $matches.
    if (!preg_match($regex, $cronLine, $matches)) {
      continue;
    }
    if ($matches[1] !== "") {
      $crontabArray[] = [
        "keyword" => $matches[1],
        "task" => $matches[2],
      ];
    } else {
      $crontabArray[] = [
        "minute" => $matches[3],
        "hour" => $matches[4],
        "day" => $matches[5],
        "month" => $matches[6],
        "weekDay" => $matches[7],
        "task" => $matches[8],
      ];
    }
  }
  return $crontabArray;
}

$crontab = file_get_contents('crontab.txt');
header("Content-type: application/json; charset=UTF-8");
header("Access-Control-Allow-Origin: *");
echo json_encode(crontab_parse($crontab));

http://localhost:8080/pcre/codes/crontab.php:

$ curl -i http://localhost:8080/pcre/codes/crontab.php
HTTP/1.1 200 OK
Date: Wed, 04 Dec 2019 12:29:18 GMT
Server: Apache/2.4.38 (Debian)
X-Powered-By: PHP/7.3.12
Access-Control-Allow-Origin: *
Content-Length: 737
Content-Type: application/json; charset=UTF-8

[{"minute":"30","hour":"08","day":"10","month":"06","weekDay":"*","task":"\/scripts\/full-backup"},{"minute":"00","hour":"11,16","day":"*","month":"*","weekDay":"*","task":"\/scripts\/incremental-backup"},{"minute":"00","hour":"09-18","day":"*","month":"*","weekDay":"*","task":"\/scripts\/check-db-status"},{"minute":"00","hour":"09-18","day":"*","month":"*","weekDay":"1-5","task":"\/scripts\/check-db-status"},{"minute":"*\/10","hour":"*","day":"*","month":"*","weekDay":"*","task":"\/scripts\/monitor.sh"},{"keyword":"@yearly","task":"\/scripts\/annual-maintenance"},{"keyword":"@monthly","task":"\/scripts\/tape-backup"},{"keyword":"@daily","task":"\/scripts\/cleanup-logs"},{"keyword":"@reboot","task":"\/script\/start-service-x"}]
[
  {
    minute: "30",
    hour: "08",
    day: "10",
    month: "06",
    weekDay: "*",
    task: "/scripts/full-backup"
  },
  {
    minute: "00",
    hour: "11,16",
    day: "*",
    month: "*",
    weekDay: "*",
    task: "/scripts/incremental-backup"
  },
  {
    minute: "00",
    hour: "09-18",
    day: "*",
    month: "*",
    weekDay: "*",
    task: "/scripts/check-db-status"
  },
  {
    minute: "00",
    hour: "09-18",
    day: "*",
    month: "*",
    weekDay: "1-5",
    task: "/scripts/check-db-status"
  },
  {
    minute: "*/10",
    hour: "*",
    day: "*",
    month: "*",
    weekDay: "*",
    task: "/scripts/monitor.sh"
  },
  {
    keyword: "@yearly",
    task: "/scripts/annual-maintenance"
  },
  {
    keyword: "@monthly",
    task: "/scripts/tape-backup"
  },
  {
    keyword: "@daily",
    task: "/scripts/cleanup-logs"
  },
  {
    keyword: "@reboot",
    task: "/script/start-service-x"
  }
];
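
The per-line loop can also be replaced by a single call over the whole file. The following is only a sketch under a few assumptions: the pattern is a simplified rewrite (any `@word` keyword, `\S+` for each field), the sample crontab is inlined instead of read from `crontab.txt`, and the `m` modifier makes `^` and `$` match at every line.

```php
<?php
$crontab = "30 08 10 06 * /scripts/full-backup\n"
         . "*/10 * * * * /scripts/monitor.sh\n"
         . "@daily /scripts/cleanup-logs\n";

// Group 1: @keyword; groups 2-6: the five time fields; group 7: the task.
$regex = '/^(?:(@\w+)|(\S+) (\S+) (\S+) (\S+) (\S+)) (.+)$/m';
// PREG_SET_ORDER yields one array per matched line.
preg_match_all($regex, $crontab, $lines, PREG_SET_ORDER);

$entries = [];
foreach ($lines as $m) {
    // $m[1] is "" when the five-field form matched.
    if ($m[1] !== "") {
        $entries[] = ["keyword" => $m[1], "task" => $m[7]];
    } else {
        $entries[] = [
            "minute" => $m[2], "hour" => $m[3], "day" => $m[4],
            "month" => $m[5], "weekDay" => $m[6], "task" => $m[7],
        ];
    }
}
echo json_encode($entries, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

Because the pattern is anchored, blank lines and lines with too few fields are simply not matched.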

preg_match_all()

Ping API

$ ping -c2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=63 time=148 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=63 time=144 ms

--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1024ms
rtt min/avg/max/mdev = 144.691/146.669/148.648/2.015 ms

Shell script mode

codes/ping.sh:

#!/bin/bash
ip=$1
cnt=$2
resping=`ping -c$cnt $ip`
# Keep only the reply lines and reduce each one to: bytes ip icmp_seq ttl time
temp=`echo -e "$resping \n" | grep icmp_seq | sed -E "s/(.+) bytes from (.+): icmp_seq=(.+) ttl=(.+) time=(.+) ms/\1 \2 \3 \4 \5/"`
n_linhas=`echo -e "$temp \n" | wc -l`
res="[\n"
for (( i=1; i < $n_linhas; i++ ))
do
	for (( j=1; j <= 5; j++ ))
	do
		# Field j of reply line i
		rem=`echo -e "$temp \n" | sed "$i! d" | cut -f$j -d" "`
		case $j in
			1)
				bytes="\"bytes\":$rem,"
			;;
			2)
				# The address is a string, so it must be quoted to produce valid JSON
				ip="\"ip\":\"$rem\","
			;;
			3)
				icmp_seq="\"icmp_seq\":$rem,"
			;;
			4)
				ttl="\"ttl\":$rem,"
			;;
			5)
				timen="\"time\":$rem"
			;;
		esac
	done
	# The last object must not be followed by a comma
	if (( $i == $n_linhas-1 ))
	then
		res="$res\t{\n\t\t$bytes\n\t\t$ip\n\t\t$icmp_seq\n\t\t$ttl\n\t\t$timen\n\t}\n"
	else
		res="$res\t{\n\t\t$bytes\n\t\t$ip\n\t\t$icmp_seq\n\t\t$ttl\n\t\t$timen\n\t},\n"
	fi
done
res="$res]"
echo -e "$res"

# chmod +x ping.sh
# ./ping.sh 8.8.8.8 3
[
        {
                "bytes":64,
                "ip":"8.8.8.8",
                "icmp_seq":2,
                "ttl":37,
                "time":84.5
        }
]

codes/ping-sh.php:

<?php
	function ping($host, $count) {
		// escapeshellarg() keeps request parameters from injecting shell syntax.
		$command = "./ping.sh " . escapeshellarg($host) . " " . escapeshellarg($count);
		return shell_exec($command);
	}

	$host = $_GET['host'] ?? '';
	$count = $_GET['count'] ?? '1';

	header("Content-type: application/json; charset=UTF-8");
	header("Access-Control-Allow-Origin: *");
	echo ping($host, $count);

http://localhost:8080/pcre/codes/ping-sh.php?host=8.8.8.8

$ curl -i http://localhost:8080/pcre/codes/ping-sh.php\?host\=8.8.8.8
HTTP/1.1 200 OK
Server: Apache/2.4.38 (Debian)
X-Powered-By: PHP/7.3.12
Access-Control-Allow-Origin: *
Content-Length: 84
Content-Type: application/json; charset=UTF-8

[
        {
                "bytes":64,
                "ip":"8.8.8.8",
                "icmp_seq":1,
                "ttl":37,
                "time":77.5
        }
]

PCRE mode

/\(([\d\.]+)\)/

/icmp_seq=(\d+) ttl=(\d+) time=([\d\.]+)/

/(\d+) packets transmitted, (\d+) (packets received|received)/

/min\/avg\/max\/(stddev|mdev) = ([\d\.]+)\/([\d\.]+)\/([\d\.]+)\/([\d\.]+)/
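
The second pattern matches once per reply line, which is where `preg_match_all()` comes in. As a small sketch on two sample reply lines from the ping output above, the default `PREG_PATTERN_ORDER` groups results by capture group, while `PREG_SET_ORDER` groups them by match; the variable names are illustrative.

```php
<?php
$result = "64 bytes from 8.8.8.8: icmp_seq=1 ttl=63 time=148 ms\n"
        . "64 bytes from 8.8.8.8: icmp_seq=2 ttl=63 time=144 ms\n";
$regex = '/icmp_seq=(\d+) ttl=(\d+) time=([\d\.]+)/';

// Default order: $byGroup[1] holds every seq, $byGroup[2] every ttl, and so on.
$count = preg_match_all($regex, $result, $byGroup);

// PREG_SET_ORDER: $byMatch[0] is the first reply, $byMatch[1] the second.
preg_match_all($regex, $result, $byMatch, PREG_SET_ORDER);

$packets = [];
foreach ($byMatch as $m) {
    $packets[] = [
        "seq" => (int) $m[1],
        "ttl" => (int) $m[2],
        "time" => (float) $m[3],
    ];
}
echo json_encode($packets), "\n";
```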

codes/ping.php:

<?php
  function ping($host, $count) {
    $pingInfo = [];

    $result = ping_command($host, $count);

    if ($host && $result) {
      $pingInfo["host"] = $host;
      $pingInfo += ping_encode($result);
    } else {
      http_response_code(500);
      $pingInfo['error'] = 'Unknown host';
    }

    header("Content-type: application/json; charset=UTF-8");
    header("Access-Control-Allow-Origin: *");
    echo json_encode($pingInfo);
  }

  function is_unknown_host($result) {
    return strpos($result, 'Unknown host') !== false;
  }

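  function ping_command($host, $count) {
    // Cast and escape the request parameters before they reach the shell.
    $command = "ping -c" . (int) $count . " " . escapeshellarg((string) $host);
    $result = shell_exec($command);
    // shell_exec() returns NULL when the command produces no output.
    return ($result === NULL || is_unknown_host($result)) ? NULL : $result;
  }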

  function ping_encode($result) {
    $json = [];

    // ip
    $regex = "/\(([\d\.]+)\)/";
    preg_match($regex, $result, $matches);
    $json["ip"] = $matches[1];
    
    // packets
    $json["packets"] = [];
    $regex = "/icmp_seq=(\d+) ttl=(\d+) time=([\d\.]+)/";
    preg_match_all($regex, $result, $matches);
    foreach($matches[1] as $key => $sequence){
      $json["packets"][] = [
        "seq" => (int) $matches[1][$key],
        "ttl" => (int) $matches[2][$key],
        "time" => (float) $matches[3][$key]
      ];
    }
    
    // statistics
    $json["statistics"] = [];
    $regex = "/(\d+) packets transmitted, (\d+) (packets received|received)/";
    preg_match($regex, $result, $matches);
    $json["statistics"]["transmitted"] = (int) $matches[1];
    $json["statistics"]["received"] = (int) $matches[2];
    $json["statistics"]["lost"] = $matches[1] - $matches[2];

    // Group 1 captures the "stddev"/"mdev" label, so the values start at group 2.
    $regex = "/min\/avg\/max\/(stddev|mdev) = ([\d\.]+)\/([\d\.]+)\/([\d\.]+)\/([\d\.]+)/";
    preg_match($regex, $result, $matches);
    $json["statistics"]["min"] = (float) $matches[2];
    $json["statistics"]["avg"] = (float) $matches[3];
    $json["statistics"]["max"] = (float) $matches[4];
    $json["statistics"]["stddev"] = (float) $matches[5];

    return $json;
  }

  $host = $_GET["host"] ?? null;
  $count = $_GET["count"] ?? 1;

  ping($host, $count);
?>

http://localhost:8080/pcre/codes/ping.php:

$ curl -i http://localhost:8080/pcre/codes/ping.php
HTTP/1.1 500 Internal Server Error
Host: localhost:8090
Connection: close
X-Powered-By: PHP/7.3.12
Content-type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *

{
  "error":"Unknown host"
}

http://localhost:8080/pcre/codes/ping.php?host=test:

$ curl -i 'http://localhost:8080/pcre/codes/ping.php?host=test'
HTTP/1.1 500 Internal Server Error
Host: localhost:8090
Connection: close
X-Powered-By: PHP/7.3.12
Content-type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *

{
  "error":"Unknown host"
}

http://localhost:8080/pcre/codes/ping.php?host=8.8.8.8:

$ curl -i 'http://localhost:8080/pcre/codes/ping.php?host=8.8.8.8'
HTTP/1.1 200 OK
Host: localhost:8090
Connection: close
X-Powered-By: PHP/7.3.12
Content-type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *

{
  "host":"8.8.8.8",
  "ip":"8.8.8.8",
  "packets":[
    {
      "seq":0,
      "ttl":49,
      "time":143.257
    }
  ],
  "statistics":{
    "transmitted":1,
    "received":1,
    "lost":0,
    "min":143.257,
    "avg":143.257,
    "max":143.257,
    "stddev":0
  }
}

http://localhost:8080/pcre/codes/ping.php?host=8.8.8.8&count=2:

$ curl -i 'http://localhost:8080/pcre/codes/ping.php?host=8.8.8.8&count=2'
HTTP/1.1 200 OK
Host: localhost:8090
Connection: close
X-Powered-By: PHP/7.3.12
Content-type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *

{
  "host":"8.8.8.8",
  "ip":"8.8.8.8",
  "packets":[
    {
      "seq":0,
      "ttl":49,
      "time":142.184
    },
    {
      "seq":1,
      "ttl":49,
      "time":141.717
    }
  ],
  "statistics":{
    "transmitted":2,
    "received":2,
    "lost":0,
    "min":141.717,
    "avg":141.951,
    "max":142.184,
    "stddev":0.234
  }
}
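
Since the `host` parameter ends up on a command line, it may be worth rejecting odd values before `ping` ever runs. The sketch below is one possible check; `is_valid_host()` is a hypothetical helper that is not part of the original `ping.php`, and the accepted character set (letters, digits, dots, hyphens) is an assumption that covers IPv4 addresses and plain host names only.

```php
<?php
// Hypothetical validator, not part of the original ping.php.
function is_valid_host(?string $host): bool
{
    // \A and \z anchor the whole string; unlike $, \z does not
    // tolerate a trailing newline. The first and last characters
    // must be alphanumeric, so values such as "-c" are rejected.
    $regex = '/\A[A-Za-z0-9](?:[A-Za-z0-9.-]{0,252}[A-Za-z0-9])?\z/';
    return $host !== null && preg_match($regex, $host) === 1;
}

var_dump(is_valid_host("8.8.8.8"));
var_dump(is_valid_host("8.8.8.8; rm -rf /"));
```

A caller could answer with an error response when the check fails and only then build the command.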