Pre-processing Wowza Streaming Engine (or Wowza Media Server) logs with grok pattern syntax for better parsing with Logstash.
If you just want the pattern to copy and paste, jump down to The Full Pattern section below.
In a previous blog post I described setting up Logstash with Elasticsearch, Kibana and logstash-forwarder (lumberjack). In my logstash config file, I referenced a grok pattern for parsing a Wowza access log. I will describe that pattern here.
This was created for Wowza Media Server 3.6 and works with Wowza Streaming Engine 4.0.
Grok
Using the grok filter in Logstash allows patterns to be captured from a Logstash field, and those captured patterns then become fully-fledged fields of their own. Instead of a single message field containing the whole Wowza log entry, the entry can be expanded so that each part of the message becomes a field.
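As a minimal illustration (the sample line and field names here are illustrative, not from a Wowza log), grok's building block is %{PATTERN:fieldname}. A filter like this pulls three fields out of a simple space-separated line such as 55.3.244.1 GET /index.html:
filter {
  grok {
    # IP, WORD and URIPATHPARAM are patterns shipped with Logstash
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}" ]
  }
}
After the filter runs, the event carries client, method and request fields alongside the original message.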
A Wowza log entry looks like this:
2014-06-11\t06:53:34\tUTC\tcreate\tstream\tINFO\t200\t-\t-\t_defaultVHost_\tvod\t_definst_\t0.0\tstream.example.com\t1935\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven\t219.92.86.40\trtsp\t-\tSonyC6603 Build/10.5.A.0.230 stagefright/1.2 (Linux;Android 4.4.2)\t1335194632\t0\t0\t3\t0\t0\t0\tamazons3/my-bucket/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven\t-\t-\t-\t-\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven
It's not very useful like this. I can't easily search through it. I certainly can't see information at a glance. Each of these tab-delimited fields means something. These "columns" have names which are specified in the user manual.
With grok, we can create new fields using the names in the Wowza manual and the values in the message field. Instead of a single message field, we can have:
- date: 2014-06-11
- time: 06:53:34
- tz: UTC
- xEvent: create
- xCategory: stream
- xSeverity: INFO
- xStatus: 200
- xComment: -
- xApp: vod
- xAppinst: _definst_
- sUri: rtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven
...and so on. This is much better.
Basic Patterns
Creating grok patterns is described in the logstash filters documentation.
Date
The default grok date formats do not include the YYYY-MM-DD format which Wowza uses, so we can build it from the default year, month number and day number patterns:
YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}
This gives a syntax called YMDDATE.
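Once that line is saved to a patterns file (covered in Using The Pattern In Logstash below), the new syntax is used exactly like a built-in one. For example:
%{YMDDATE:date}
captures the 2014-06-11 part of the example entry above into a field named date.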
Fields
The fields in the log are separated by tab characters by default. Anything that comes after the date and time and starts with a tab character is a new field.
WOWZENTRY [^\t]+
This becomes a problem because the xComment field can be written to from your own Wowza Java modules through WMSLogger.info() and similar methods. As that field can contain user data (things not written by Wowza itself), it is not unreasonable to expect tab characters in it.
By that reasoning, I added another, special capture after the xComment capture:
(?<xCommentMalformed>\t.*)?
This stops tab characters in xComment from breaking the grok pattern and causing the middle of the comment to be treated as other fields. Instead, the offending part of the comment goes into a new field called xCommentMalformed. The field can have any unique name.
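To illustrate with a made-up comment, suppose a module logs user lookup followed by a tab and id=42, shown below with <TAB> standing in for the tab character. The captures then come out like this:
xComment          => user lookup
xCommentMalformed => <TAB>id=42
Without the optional capture, id=42 would be captured as xVhost and every later field would shift along by one column.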
Specific fields
There are fields in the log which could be captured by grok's default syntax, such as IP, PATH and NONNEGINT. These cannot be used directly, because Wowza writes a hyphen (-) in place of the value when a field was not populated for that message.
To address this issue, I created some specific implementations of these syntax labels.
WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZPATH (?:%{PATH}|-)
WOWZNUMBER (?:%{NUMBER}|-)
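As a quick illustration, %{WOWZIP:cIp} happily captures either of the following values (the second being what Wowza writes when the client IP is not recorded), whereas a bare %{IP:cIp} would cause the whole match to fail on the second:
c-ip column: 219.92.86.40   matched by %{IP} and %{WOWZIP}
c-ip column: -              matched by %{WOWZIP} only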
The Full Pattern
The part you probably just wanted to copy and paste.
This creates a syntax named WOWZAACCESSLOG.
This pattern is for Wowza Streaming Engine 4.1.
YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}
WOWZENTRY [^\t]+
WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZNUMBER (?:%{NUMBER}|-)
WOWZPATH (?:%{PATH}|-)
WOWZAACCESSLOG %{YMDDATE:date}\t%{TIME:time}\t%{TZ:tz}\t%{WOWZENTRY:xEvent}\t%{WORD:xCategory}\t%{LOGLEVEL:xSeverity}\t%{WOWZNONNEGINT:xStatus}\t%{WOWZENTRY:xCtx}\t%{WOWZENTRY:xComment}(?<xCommentMalformed>\t.*)?\t%{WOWZENTRY:xVhost}\t%{WOWZENTRY:xApp}\t%{WOWZENTRY:xAppinst}\t%{NUMBER:xDuration}\t%{WOWZENTRY:sIp}\t%{WOWZNONNEGINT:sPort}\t%{WOWZENTRY:sUri}\t%{WOWZIP:cIp}\t%{WOWZENTRY:cProto}\t%{WOWZENTRY:cReferrer}\t%{WOWZENTRY:cUserAgent}\t%{WOWZENTRY:cClientId}\t%{WOWZNONNEGINT:csBytes}\t%{WOWZNONNEGINT:scBytes}\t%{WOWZENTRY:xStreamId}\t%{WOWZNONNEGINT:xSpos}\t%{WOWZNONNEGINT:csStreamBytes}\t%{WOWZNONNEGINT:scStreamBytes}\t%{WOWZENTRY:xSname}\t%{WOWZENTRY:xSnameQuery}\t%{WOWZPATH:xFileName}\t%{WOWZENTRY:xFileExt}\t%{WOWZNONNEGINT:xFileSize}\t%{WOWZNUMBER:xFileLength}\t%{WOWZENTRY:xSuri}\t%{WOWZENTRY:xSuriStem}\t%{WOWZENTRY:xSuriQuery}\t%{WOWZENTRY:csUriStem}\t%{WOWZENTRY:csUriQuery}
This pattern is for older versions of Wowza Streaming Engine and Wowza Media Server. If a pattern gives you _grokparsefailure tags or other errors, try the other.
YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}
WOWZENTRY [^\t]+
WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZPATH (?:%{PATH}|-)
WOWZAACCESSLOG %{YMDDATE:date}\t%{TIME:time}\t%{TZ:tz}\t%{WOWZENTRY:xEvent}\t%{WORD:xCategory}\t%{LOGLEVEL:xSeverity}\t%{WOWZNONNEGINT:xStatus}\t%{WOWZENTRY:xCtx}\t%{WOWZENTRY:xComment}(?<xCommentMalformed>\t.*)?\t%{WOWZENTRY:xVhost}\t%{WOWZENTRY:xApp}\t%{WOWZENTRY:xAppinst}\t%{NUMBER:xDuration}\t%{WOWZENTRY:sIp}\t%{WOWZNONNEGINT:sPort}\t%{WOWZENTRY:sUri}\t%{WOWZIP:cIp}\t%{WOWZENTRY:cProto}\t%{WOWZENTRY:cReferrer}\t%{WOWZENTRY:cUserAgent}\t%{WOWZENTRY:cClientId}\t%{WOWZNONNEGINT:csBytes}\t%{WOWZNONNEGINT:scBytes}\t%{WOWZENTRY:xStreamId}\t%{WOWZNONNEGINT:xSpos}\t%{WOWZNONNEGINT:csStreamBytes}\t%{WOWZNONNEGINT:scStreamBytes}\t%{WOWZENTRY:xSname}\t%{WOWZENTRY:xSnameQuery}\t%{WOWZPATH:xFileName}\t%{WOWZENTRY:xFileExt}\t%{WOWZNONNEGINT:xFileSize}\t%{WOWZNONNEGINT:xFileLength}\t%{WOWZENTRY:xSuri}\t%{WOWZENTRY:xSuriStem}\t%{WOWZENTRY:xSuriQuery}\t%{WOWZENTRY:csUriStem}\t%{WOWZENTRY:csUriQuery}
The difference is that x-file-length can be a decimal in newer logs, whereas the older pattern only looks for non-negative integers.
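As a concrete illustration (these values are made up):
x-file-length = 1335     matched by both %{WOWZNONNEGINT} and %{WOWZNUMBER}
x-file-length = 66.188   matched by %{WOWZNUMBER} only
x-file-length = -        matched by both
A single failed capture is enough to fail the whole WOWZAACCESSLOG match.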
It's worth noting that these patterns work if your logging fields are in this order:
- date
- time
- tz
- x-event
- x-category
- x-severity
- x-status
- x-ctx
- x-comment
- x-vhost
- x-app
- x-appinst
- x-duration
- s-ip
- s-port
- s-uri
- c-ip
- c-proto
- c-referrer
- c-user-agent
- c-client-id
- cs-bytes
- sc-bytes
- x-stream-id
- x-spos
- cs-stream-bytes
- sc-stream-bytes
- x-sname
- x-sname-query
- x-file-name
- x-file-ext
- x-file-size
- x-file-length
- x-suri
- x-suri-stem
- x-suri-query
- cs-uri-stem
- cs-uri-query
Using The Pattern In Logstash
I recommend reading a previous blog post for setting up logstash if you haven't already.
Save these patterns to a file. It can be any file as long as you remember the path and that logstash has permission to read it. Try /etc/logstash/patterns/wowza.
Edit your logstash configuration file /etc/logstash/conf.d/logstash.conf. Add a new grok filter:
filter {
  ... other filters ...
  grok {
    patterns_dir => "/etc/logstash/patterns"
    match => [ "message", "%{WOWZAACCESSLOG}" ]
  }
}
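If you want to check the pattern before wiring it into your real pipeline, a throwaway configuration that reads a pasted log line from stdin and prints the parsed event can help. This is just a sketch, assuming the patterns file from above already exists at /etc/logstash/patterns; it is not part of the original setup:
input {
  stdin { }
}
filter {
  grok {
    patterns_dir => "/etc/logstash/patterns"
    match => [ "message", "%{WOWZAACCESSLOG}" ]
  }
}
output {
  # rubydebug prints every field of the parsed event to the console
  stdout { codec => rubydebug }
}
Paste a single access log line (tabs intact) and each field should appear by name; a _grokparsefailure tag suggests the field order or Wowza version does not match the pattern.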
Telling logstash the time
By default, Logstash stamps each log entry with the current time. To make it use the time in the log message instead, it needs to be told where to find it.
The required date fields are date, time and tz, as defined in the WOWZAACCESSLOG grok pattern.
Create a new field containing a timestamp (named datetime) using grok's add_field option, then use the date filter in Logstash to capture it:
grok {
  patterns_dir => "/etc/logstash/patterns"
  match => [ "message", "%{WOWZAACCESSLOG}" ]
  add_field => [ "datetime", "%{date} %{time} %{tz}" ]
}
date {
  match => [ "datetime", "yyyy-MM-dd HH:mm:ss Z" ]
}
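Optionally, once the date filter has taken over the timestamp, the temporary datetime field can be dropped and numeric columns such as the byte counters converted from strings so they can be aggregated. This is a possible addition (the field names come from the grok pattern above), not something the original configuration requires:
mutate {
  # the datetime field was only needed to feed the date filter
  remove_field => [ "datetime" ]
  # store the byte counters as integers rather than strings
  convert => [ "csBytes", "integer", "scBytes", "integer" ]
}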
Conclusion
This post has covered the creation of a custom grok syntax to parse a Wowza access log entry for use with Logstash. It separates the log into individual fields, which should make your Wowza logs considerably more useful and searchable, particularly when used with Elasticsearch.
When used in conjunction with my post covering the setting up of Elasticsearch, Logstash, Kibana and logstash-forwarder, these posts should give you enough information to organise your Wowza logs.
Happy Wowza debugging!
P.S. On the subject of debugging, check out my post for debugging your Wowza modules code by stepping through the Java code in NetBeans (or any JIT debugger) remotely.