Keep in mind that many factors influence parsing time, including the processor, the amount of RAM, and the log itself. As a rough guide, however, we can derive the following table:
|Benchmark with full features & metrics enabled (v0.9)|Parsing speed|
|GLib Hash Tables|50,000 lines per second|
|On-Disk B+ Tree|23,000 lines per second|
|In-memory hash table|46,000 lines per second|
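To get a feel for what these rates mean for a given log, here is a minimal Python sketch that turns a line count into an estimated parse time. The rates are copied from the table above. The function and key names are made up for this illustration, and it assumes the rate stays constant for the whole log, so treat the result as a ballpark figure only.

```python
# Lines per second, copied from the v0.9 benchmark table above.
# The dictionary keys are illustrative names, not GoAccess options.
RATES = {
    "glib_hash": 50_000,
    "on_disk_btree": 23_000,
    "in_memory_hash": 46_000,
}


def estimate_parse_seconds(lines: int, storage: str) -> float:
    """Estimate seconds needed to parse `lines` log lines.

    Assumes the benchmark rate holds for the whole log, which real
    hardware and real logs will not guarantee.
    """
    if lines < 0:
        raise ValueError("lines must be non-negative")
    return lines / RATES[storage]


if __name__ == "__main__":
    for name in RATES:
        seconds = estimate_parse_seconds(1_000_000, name)
        print(f"{name}: about {seconds:.1f} s per million lines")
```

For example, a one-million-line log would take roughly 20 seconds at the GLib hash table rate.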
If you are using the standard log format that comes with Apache or Nginx, configuring GoAccess should be pretty straightforward.
There are two ways to configure the log format. If you are outputting to a
terminal (ncurses), the easiest is to have GoAccess
prompt a configuration window. However, this won't make the setting permanent; for that
you will need to specify the format in the configuration file.
The configuration file is located under
%sysconfdir% is either
In the configuration file you need to uncomment
The following should work for the standard Apache or Nginx formats.
time-format %T
date-format %d/%b/%Y
log-format %h %^[%d:%^] "%r" %s %b "%R" "%u"
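To make the specifiers in that log-format line more concrete, here is a rough Python sketch that picks the same fields out of a sample access log line. It assumes the usual meaning of the specifiers (%h host, %d date, %r request, %s status code, %b bytes, %R referrer, %u user agent, %^ ignore this field). The regular expression and the sample line are illustrative only and are not how GoAccess parses logs internally.

```python
import re
from datetime import date, datetime

# Rough regex counterpart of:
#   log-format %h %^[%d:%^] "%r" %s %b "%R" "%u"
LOG_RE = re.compile(
    r'(?P<host>\S+) '           # %h  client host
    r'[^\[]*'                   # %^  ignored (identity and user fields)
    r'\[(?P<date>[^:]+):'       # %d  date, up to the first colon
    r'[^\]]*\] '                # %^  ignored (time of day and zone)
    r'"(?P<request>[^"]*)" '    # %r  request line
    r'(?P<status>\d{3}) '       # %s  status code
    r'(?P<bytes>\d+|-) '        # %b  response size, "-" when empty
    r'"(?P<referrer>[^"]*)" '   # %R  referrer
    r'"(?P<agent>[^"]*)"'       # %u  user agent
)

# Matches the "date-format %d/%b/%Y" setting above.
DATE_FORMAT = "%d/%b/%Y"

# Made-up sample line in the combined log format.
SAMPLE = (
    '127.0.0.1 - - [10/Oct/2013:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/" "Mozilla/5.0"'
)


def parse_line(line: str):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["date"] = datetime.strptime(fields["date"], DATE_FORMAT).date()
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

If a line from your own log does not match a pattern like this, that is a hint the log is not in the standard format and the log-format line needs adjusting.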
For more information, please check GoAccess' man page
It's fairly easy to run GoAccess once it has been installed (no configuration is needed). Just run it against your web log file (-a is optional):
# goaccess -f /var/log/apache2/access.log -a
If we want more flexibility, we can feed GoAccess through a series of pipes. For instance, to process all
access.log.*.gz files we can do:
# zcat -f access.log* | goaccess
For more examples, please check GoAccess' man page
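If zcat is not available, the same idea as the zcat -f pipeline above can be sketched in Python: read each log, decompress it only if it is gzip data, and write the lines to standard output so they can be piped into GoAccess. This is a minimal sketch; the function name and the script name used below are hypothetical.

```python
import gzip
import sys
from pathlib import Path
from typing import Iterable, Iterator

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def iter_log_lines(paths: Iterable[Path]) -> Iterator[bytes]:
    """Yield raw lines from plain or gzip-compressed logs, like `zcat -f`."""
    for path in paths:
        # Peek at the first two bytes to decide how to open the file.
        with open(path, "rb") as probe:
            is_gzip = probe.read(2) == GZIP_MAGIC
        opener = gzip.open if is_gzip else open
        with opener(path, "rb") as handle:
            yield from handle


if __name__ == "__main__":
    # Log files are taken from the command line, in the order given.
    for line in iter_log_lines(Path(arg) for arg in sys.argv[1:]):
        sys.stdout.buffer.write(line)
```

Saved as, say, concat_logs.py, it would be used the same way as the shell version: python3 concat_logs.py access.log* | goaccess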
To generate an HTML report, run GoAccess against your web log file and redirect the output to a file (-a is optional):
# goaccess -f /var/log/apache2/access.log -a > report.html
OR
# zcat -f /var/log/apache2/access.log* | goaccess -a > report.html
Note: You can run GoAccess via
cat /var/log/apache2/access.log | goaccess -a > report.html
GoAccess should not leak any memory, (tested with
so memory usage will mostly depend on the log size and the initial parse.
For 496,750 parsed lines, it uses
~36.9 MiB (full features enabled).
Note: Removing the query string with
-q can greatly decrease memory consumption, especially on timestamped requests.
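To see why query strings matter, here is a small Python illustration with made-up request paths. It assumes that memory grows with the number of distinct request strings that have to be kept, and it only counts distinct strings; it does not measure GoAccess itself.

```python
def strip_query(request_path: str) -> str:
    """Drop everything from the first '?' onwards."""
    return request_path.split("?", 1)[0]


# Hypothetical requests: the same file fetched with a changing timestamp.
REQUESTS = [
    "/app.js?ts=1381400001",
    "/app.js?ts=1381400002",
    "/app.js?ts=1381400003",
    "/index.html",
]

if __name__ == "__main__":
    with_query = len(set(REQUESTS))
    without_query = len({strip_query(r) for r in REQUESTS})
    print(f"distinct requests with query strings:    {with_query}")
    print(f"distinct requests without query strings: {without_query}")
```

Here four distinct requests collapse into two once the query string is dropped. On a real log with timestamped requests the difference can be far larger.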
Here's an extensible Amazon S3 and CloudFront
log parser in Python that uses GoAccess.
(Thanks to Viktor Nagy)
This section describes how to install Tokyo Cabinet from the source package.
$ wget http://fallabs.com/tokyocabinet/tokyocabinet-1.4.48.tar.gz
$ tar -zxvf tokyocabinet-1.4.48.tar.gz
$ cd tokyocabinet-1.4.48
$ ./configure --prefix=/usr --enable-off64 --enable-fastest
$ make
# make install
If you have a large dataset that won't fit in physical memory or you want data persistence, then you want to use the B+ Tree on-disk database.
$ ./configure --enable-utf8 --enable-geoip --enable-tcb=btree
$ make
# make install
Note that Tokyo Cabinet must be installed before you configure GoAccess. You can install Tokyo Cabinet with your package manager (see dependencies table) or from source (see question above). You may also choose to disable compression; see configuration options for more details.
Here are some of the top features to add:
Please see GitHub for more details.
Thanks to Chris Orgill, GoAccess has been successfully built under OpenBSD.
(ksh)
# pkg_add GeoIP
# pkg_add glib2
# export LDFLAGS=-L/usr/local/lib
# ./configure --enable-geoip --enable-utf8
# make
# make install
Runs to completion. Make sure your terminal supports UTF8, e.g.,
If you would like to be notified of new releases of GoAccess then please follow the project on Twitter. Feel free to share it with others too :)