2014-06-11

Organising Wowza Logs with Logstash and Grok


Pre-processing Wowza Streaming Engine (or Media Server) logs with Grok patterns syntax for better parsing with logstash.

>>>Click here to jump to the pattern to Copy & Paste.<<<

In a previous blog post I described setting up Logstash with Elasticsearch, Kibana and logstash-forwarder (lumberjack). In my logstash config file, I referenced a grok pattern for parsing a Wowza access log. I will describe that pattern, here.

This was created for Wowza Media Server 3.6 and works with Wowza Streaming Engine 4.0.

Grok

Using the grok filter in Logstash allows for patterns to be captured from a logstash field. These captured patterns can then become fully-fledged fields. Instead of a field called message containing the whole Wowza log entry, it can be expanded so that each part of the message becomes a field.

A Wowza log entry looks like this:

2014-06-11\t06:53:34\tUTC\tcreate\tstream\tINFO\t200\t-\t-\t_defaultVHost_\tvod\t_definst_\t0.0\tstream.example.com\t1935\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven\t219.92.86.40\trtsp\t-\tSonyC6603 Build/10.5.A.0.230 stagefright/1.2 (Linux;Android 4.4.2)\t1335194632\t0\t0\t3\t0\t0\t0\tamazons3/my-bucket/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven\t-\t-\t-\t-\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven\trtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4\tsession_token=xxxxx&username=steven

It's not very useful like this. I can't easily search through it. I certainly can't see information at a glance. Each of these tab-delimited fields means something. These "columns" have names which are specified in the user manual.

With grok, we can create new fields using the names in the Wowza manual and the values in the message field. Instead of a single message field, we can have:

date
2014-06-11
time
06:53:34

tz



UTC

xEvent



create

xCategory



stream

xSeverity



INFO

xStatus



200

xComment



-

xApp



vod

xAppinst



_definst_

sUri



rtsp://stream.example.com:1935/vod/_definst_/mp4:FootballTV/450kbps/BPLClipOfTheWeekArsenalVsFulham_450kbps.mp4?session_token=xxxxx&username=steven

...and so on. This is much better.

Basic Patterns

Creating grok patterns is described in the logstash filters documentation.

Date

The default grok date formats do not include the YYYY-MM-DD format which Wowza uses. We can use the default year, month number and day number.

YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}

This gives a syntax called YMDDATE.


Fields

The fields in the log are separated by tab characters by default. Anything after the date and time AND starts with a tab character is a new field.

WOWZENTRY [^\t]+

This becomes a problem when the xComment field can be written to in your own Wowza Java modules through WMSLogger.info() and similar methods. As that field can contain user data (things not written by Wowza), then it is not unreasonable to expect tab characters in that field.

By that reasoning, I added another, special capture after the xComment capture:

(?<xCommentMalformed>\t.*)?

This prevents tab characters in xComment from causing the grok pattern to break and start the middle of the xComment as other fields. It adds the offending part of the comment to a new field called xCommentMalformed. The field can have any unique name.


Specific fields

There are fields in the log which could be captured by the grok syntax defaults, such as IP, PATH and NONNEGINT. These cannot be used because Wowza will replace the values in these fields with a hyphen (-) if they were not populated by the message.

To address this issue, I created some specific implementations of these syntax labels.

WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZPATH (?:%{PATH}|-)
WOWZNUMBER (?:%{NUMBER}|-)


The Full Pattern

The part you probably just wanted to copy and paste.

This creates a syntax named WOWZAACCESSLOG.

This pattern is for Wowza Streaming Engine 4.1.

YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}
WOWZENTRY [^\t]+
WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZNUMBER (?:%{NUMBER}|-)
WOWZPATH (?:%{PATH}|-)
WOWZAACCESSLOG %{YMDDATE:date}\t%{TIME:time}\t%{TZ:tz}\t%{WOWZENTRY:xEvent}\t%{WORD:xCategory}\t%{LOGLEVEL:xSeverity}\t%{WOWZNONNEGINT:xStatus}\t%{WOWZENTRY:xCtx}\t%{WOWZENTRY:xComment}(?<xcommentmalformed>\t.*)?\t%{WOWZENTRY:xVhost}\t%{WOWZENTRY:xApp}\t%{WOWZENTRY:xAppinst}\t%{NUMBER:xDuration}\t%{WOWZENTRY:sIp}\t%{WOWZNONNEGINT:sPort}\t%{WOWZENTRY:sUri}\t%{WOWZIP:cIp}\t%{WOWZENTRY:cProto}\t%{WOWZENTRY:cReferrer}\t%{WOWZENTRY:cUserAgent}\t%{WOWZENTRY:cClientId}\t%{WOWZNONNEGINT:csBytes}\t%{WOWZNONNEGINT:scBytes}\t%{WOWZENTRY:xStreamId}\t%{WOWZNONNEGINT:xSpos}\t%{WOWZNONNEGINT:csStreamBytes}\t%{WOWZNONNEGINT:scStreamBytes}\t%{WOWZENTRY:xSname}\t%{WOWZENTRY:xSnameQuery}\t%{WOWZPATH:xFileName}\t%{WOWZENTRY:xFileExt}\t%{WOWZNONNEGINT:xFileSize}\t%{WOWZNUMBER:xFileLength}\t%{WOWZENTRY:xSuri}\t%{WOWZENTRY:xSuriStem}\t%{WOWZENTRY:xSuriQuery}\t%{WOWZENTRY:csUriStem}\t%{WOWZENTRY:csUriQuery}

This pattern is for older versions of Wowza Streaming Engine and Wowza Media Server. If a pattern gives you _grokparsefailure tags or other errors, try the other.

YMDDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY}
WOWZENTRY [^\t]+
WOWZIP (?:%{IP}|-)
WOWZNONNEGINT (?:%{NONNEGINT}|-)
WOWZPATH (?:%{PATH}|-)
WOWZAACCESSLOG %{YMDDATE:date}\t%{TIME:time}\t%{TZ:tz}\t%{WOWZENTRY:xEvent}\t%{WORD:xCategory}\t%{LOGLEVEL:xSeverity}\t%{WOWZNONNEGINT:xStatus}\t%{WOWZENTRY:xCtx}\t%{WOWZENTRY:xComment}(?<xCommentMalformed>\t.*)?\t%{WOWZENTRY:xVhost}\t%{WOWZENTRY:xApp}\t%{WOWZENTRY:xAppinst}\t%{NUMBER:xDuration}\t%{WOWZENTRY:sIp}\t%{WOWZNONNEGINT:sPort}\t%{WOWZENTRY:sUri}\t%{WOWZIP:cIp}\t%{WOWZENTRY:cProto}\t%{WOWZENTRY:cReferrer}\t%{WOWZENTRY:cUserAgent}\t%{WOWZENTRY:cClientId}\t%{WOWZNONNEGINT:csBytes}\t%{WOWZNONNEGINT:scBytes}\t%{WOWZENTRY:xStreamId}\t%{WOWZNONNEGINT:xSpos}\t%{WOWZNONNEGINT:csStreamBytes}\t%{WOWZNONNEGINT:scStreamBytes}\t%{WOWZENTRY:xSname}\t%{WOWZENTRY:xSnameQuery}\t%{WOWZPATH:xFileName}\t%{WOWZENTRY:xFileExt}\t%{WOWZNONNEGINT:xFileSize}\t%{WOWZNONNEGINT:xFileLength}\t%{WOWZENTRY:xSuri}\t%{WOWZENTRY:xSuriStem}\t%{WOWZENTRY:xSuriQuery}\t%{WOWZENTRY:csUriStem}\t%{WOWZENTRY:csUriQuery}

The difference is that x-file-length can be a decimal in newer logs, whereas my old pattern only looks for positive integers.

It's worth noting that these patterns work if your logging fields are in this order:

  1. date
  2. time
  3. tz
  4. x-event
  5. x-category
  6. x-severity
  7. x-statusx-ctx
  8. x-comment
  9. x-vhost
  10. x-app
  11. x-appinst
  12. x-duration
  13. s-ip
  14. s-port
  15. s-uri
  16. c-ip
  17. c-proto
  18. c-referrer
  19. c-user-agent
  20. c-client-id
  21. cs-bytes
  22. sc-bytes
  23. x-stream-id
  24. x-spos
  25. cs-stream-bytes
  26. sc-stream-bytes
  27. x-sname
  28. x-sname-query
  29. x-file-name
  30. x-file-ext
  31. x-file-size
  32. x-file-length
  33. x-suri
  34. x-suri-stem
  35. x-suri-query
  36. cs-uri-stem
  37. cs-uri-query

Using The Pattern In Logstash

I recommend reading a previous blog post for setting up logstash if you haven't already.

Save these patterns to a file. It can be any file as long as you remember the path and that logstash has permission to read it. Try /etc/logstash/patterns/wowza.

Edit your logstash configuration file /etc/logstash/conf.d/logstash.conf. Add a new grok filter:

filter {
    ... other filters ...
    grok {
        patterns_dir => "/etc/logstash/patterns"
        match => [ "message", "%{WOWZAACCESSLOG}" ]
    }
}

Telling logstash the time

Logstash will use the current time to stamp the log entry. In order to tell logstash to use the time in the log message instead, it needs to be told where to find it.

The required date fields are date, time and tz as defined in the WOWZACCESSLOG grok pattern.

Create a new field containing a timestamp (named datetime) using grok's add_field function and then use the date logstash filter to capture it:

grok {
    patterns_dir => "/etc/logstash/patterns"
    match => [ "message", "%{WOWZAACCESSLOG}" ]
    add_field => [ "datetime", "%{date} %{time} %{tz}" ]
}
date {
    match => [ "datetime", "yyyy-MM-dd HH:mm:ss Z" ]
}

Conclusion

This post has covered the creation of custom grok syntax to parse a Wowza access log entry for use with logstash. It will separate the log parts into individual fields. This should make your Wowza logs considerably more useful and search-able, particularly when used with Elasticsearch.

When used in conjuction with my post covering the setting up of Elasticsearch, Logstash, Kibana and logstash-forwarder, these posts should give enough information to allow you to organise your Wowza logs.

Happy Wowza debugging!

P.S. On the subject of debugging, check out my post for debugging your Wowza modules code by stepping through the Java code in NetBeans (or any JIT debugger) remotely.

13 comments:

  1. This is the first & best article to make me satisfied by presenting good content. I feel so happy and delighted.By Learn Digital Marketing Course Training in Chennai it will help to get Digital Marketing Training with Placement Institute in Chennai. If you Learn Social Media Marketing Training with Placement Institute in Chennai, you will get job soon.

    Learn Best Digital Marketing Course Training in Chennai in professional institute to get reputed job.

    ReplyDelete
  2. Cool stuff you have and you keep overhaul every one of us.

    Data Science Course

    ReplyDelete
  3. Thanks for the informative and helpful post, obviously in your blog everything is good..

    Data Science Training

    ReplyDelete
  4. Great Article
    Cloud Computing Projects


    Networking Projects

    Final Year Projects for CSE


    JavaScript Training in Chennai

    JavaScript Training in Chennai

    The Angular Training covers a wide range of topics including Components, Angular Directives, Angular Services, Pipes, security fundamentals, Routing, and Angular programmability. The new Angular TRaining will lay the foundation you need to specialise in Single Page Application developer. Angular Training

    ReplyDelete
  5. Terrific post thoroughly enjoyed reading the blog and more over found to be the tremendous one. In fact, educating the participants with it's amazing content. Hope you share the similar content consecutively.

    ai course in bhilai

    ReplyDelete
  6. Honestly speaking this blog is absolutely amazing in learning the subject that is building up the knowledge of every individual and enlarging to develop the skills which can be applied in to practical one. Finally, thanking the blogger to launch more further too.

    Data Science Course in Bhilai

    ReplyDelete
  7. Stupendous blog huge applause to the blogger and hoping you to come up with such an extraordinary content in future. Surely, this post will inspire many aspirants who are very keen in gaining the knowledge. Expecting many more contents with lot more curiosity further.

    Digital Marketing Course

    ReplyDelete
  8. Excellent blog thanks for sharing the valuable information..it becomes easy to read and easily understand the information.
    Useful article which was very helpful. also interesting and contains good information.
    to know about python training course , use the below link.

    Python Training in chennai

    Python Course in chennai




    ReplyDelete
  9. We absolutely like your blog and read almost all of your post’s to
    be just what I’m looking for.

    Java Training in Chennai

    Java Course in Chennai

    ReplyDelete
  10. I'm cheerful I found this blog! Every now and then, understudies need to psychologically the keys of beneficial artistic articles forming. Your information about this great post can turn into a reason for such individuals.
    Servicenow Training and Placement in Hyderabad

    ReplyDelete