How fast is GoAccess when parsing a log file?

Keep in mind that many factors influence parsing time, including the processor, available RAM, and the log file itself. As a general guide, however, the following table applies:

GoAccess benchmark: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz, 8GB RAM
(full features & metrics enabled, >=v0.9.5)

  Storage                   Lines per second
  Default Hash Tables       87,816
  On-Disk B+ Tree           23,000
  In-memory hash table      46,000

Note: A dataset of about 52M hits (12GB) is parsed in 20 minutes (in-memory) or 60 minutes (on-disk storage).
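
As a rough cross-check of the note above, the average throughput implied by those figures can be worked out with integer shell arithmetic. This is only a back-of-the-envelope sketch: it treats "about 52M hits" as exactly 52,000,000 lines and the durations as exactly 20 and 60 minutes, both of which are assumptions.

```shell
#!/bin/sh
# Average lines per second = total lines / elapsed seconds.
# 52,000,000 is an assumed round figure for "about 52M hits".
lines=52000000

in_memory_rate=$(( lines / (20 * 60) ))   # 20 minutes, in-memory
on_disk_rate=$(( lines / (60 * 60) ))     # 60 minutes, on-disk storage

echo "in-memory: ${in_memory_rate} lines/s"
echo "on-disk:   ${on_disk_rate} lines/s"
```

These averages come out somewhat lower than the per-second figures in the table, which fits the caveat that parsing time varies with the hardware and the log being parsed.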

How can I configure the log/date/time format for Apache or Nginx?

If you are using the standard log format that comes with Apache or Nginx, configuring GoAccess should be straightforward.

There are two ways to configure the log format. If you are outputting to a terminal (ncurses), the easiest is to run GoAccess with -c, which opens a configuration window. However, this setting is not permanent. To make it permanent, specify the format in the configuration file.

The configuration file is located under ~/.goaccessrc or %sysconfdir%/goaccess.conf where %sysconfdir% is either /etc/, /usr/etc/ or /usr/local/etc/.
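
To see which of these locations exists on a given machine, a small shell check can help. The sketch below is only an illustration: first_existing_file is a hypothetical helper name, and the order in which the candidates are tried is an assumption, not GoAccess's documented precedence.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that names an existing file.
first_existing_file() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate locations taken from the paragraph above; the search order
# used here is an assumption.
first_existing_file \
    "$HOME/.goaccessrc" \
    /etc/goaccess.conf \
    /usr/etc/goaccess.conf \
    /usr/local/etc/goaccess.conf \
    || echo "no configuration file found"
```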

In the configuration file, you need to uncomment time-format, date-format and log-format. The following should work for the standard Apache or Nginx log formats.

                            time-format %T
                            date-format %d/%b/%Y
                            log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
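
One way to add these settings from the shell is to append them to a configuration file with a here-document. The sketch below writes to a temporary file so it is safe to try; write_goaccess_formats is a hypothetical helper name made up for this example, and in practice the target would be ~/.goaccessrc or the goaccess.conf file mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: append the three format directives to the file in $1.
write_goaccess_formats() {
    # The quoted 'EOF' delimiter keeps %, quotes and brackets literal.
    cat >> "$1" <<'EOF'
time-format %T
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF
}

conf_file=$(mktemp)          # stand-in for ~/.goaccessrc
write_goaccess_formats "$conf_file"
cat "$conf_file"
```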

For more information, please check the GoAccess man page.

How do I run GoAccess?

Running GoAccess is fairly easy. Once it has been installed, no configuration is needed. Just run it against your web log file (-a is optional):

                           # goaccess -f /var/log/apache2/access.log -a

Filtering can be done with pipes. For instance, you can use grep to select specific data and then pipe the output into GoAccess. This adds a great deal of flexibility to what GoAccess can display. Pipes also make it possible to parse multiple log files:

                           # zcat -f access.log* | goaccess
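
As an illustration of filtering before GoAccess sees the data, the sketch below keeps only lines with a given HTTP status code. It assumes the standard combined log format with no spaces inside the request URL, so that the status is the ninth whitespace-separated field. filter_status is a hypothetical helper name, not part of GoAccess, and the two sample lines are invented.

```shell
#!/bin/sh
# Hypothetical helper: pass through only log lines whose status equals $1.
# Assumes combined log format, where the status is whitespace field 9.
filter_status() {
    awk -v code="$1" '$9 == code'
}

# Two invented sample lines standing in for a real access log.
sample='127.0.0.1 - - [10/Oct/2015:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.43.0"
127.0.0.1 - - [10/Oct/2015:13:55:40 -0700] "GET /missing HTTP/1.1" 404 209 "-" "curl/7.43.0"'

not_found=$(printf '%s\n' "$sample" | filter_status 404)
echo "$not_found"
```

With a real log, the same helper could feed GoAccess directly, for example: filter_status 404 < /var/log/apache2/access.log | goaccess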

For more examples, please check the GoAccess man page.

How do I generate an HTML report?

To generate an HTML report, run GoAccess against your web log file and redirect the output to a file (-a is optional):

                           # goaccess -f /var/log/apache2/access.log -a > report.html
                           # zcat -f /var/log/apache2/access.log* | goaccess -a > report.html
Note: You can run GoAccess via cron as: cat /var/log/apache2/access.log | goaccess -a > report.html
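
When the report is regenerated from cron, it can be useful to avoid leaving a half-written file behind if a run fails. The sketch below is one possible wrapper, not something GoAccess ships: generate_report is a hypothetical function name, and it assumes goaccess exits with a non-zero status when it fails.

```shell
#!/bin/sh
# Hypothetical wrapper for cron: build the report in a temporary file and
# move it into place only if goaccess exits successfully.
# Usage: generate_report <log file> <output html file>
generate_report() {
    log_file=$1
    out_file=$2
    tmp_file="${out_file}.tmp"

    if goaccess -f "$log_file" -a > "$tmp_file"; then
        mv "$tmp_file" "$out_file"
    else
        rm -f "$tmp_file"    # discard the partial report
        return 1
    fi
}
```

A script containing this function could end with a call such as generate_report /var/log/apache2/access.log report.html and then be scheduled from a crontab entry.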

What's the memory footprint of GoAccess?

GoAccess should not leak any memory (tested with Valgrind), so memory use depends mostly on the log size and the features enabled. For 247,834 parsed lines, the footprint is ~41.1 MiB (full features enabled).

Note: Removing the query string with -q can greatly decrease memory consumption, especially on timestamped requests.
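
To get a feel for why timestamped requests matter, the sketch below counts distinct request strings in a few invented log lines, with and without their query strings. It is only an illustration and does not describe GoAccess internals. It assumes the combined log format, where the request URL is the seventh whitespace-separated field, and the helper names are hypothetical.

```shell
#!/bin/sh
# Count distinct request URLs (whitespace field 7 in combined log format).
count_unique_requests() {
    awk '{ print $7 }' | sort -u | wc -l | tr -d ' '
}

# Same count after dropping everything from the first "?" onward.
count_unique_paths() {
    awk '{ sub(/\?.*/, "", $7); print $7 }' | sort -u | wc -l | tr -d ' '
}

# Invented sample: the same file requested twice with different timestamps.
sample='127.0.0.1 - - [10/Oct/2015:13:55:36 -0700] "GET /app.js?ts=1444510536 HTTP/1.1" 200 512
127.0.0.1 - - [10/Oct/2015:13:55:37 -0700] "GET /app.js?ts=1444510537 HTTP/1.1" 200 512
127.0.0.1 - - [10/Oct/2015:13:55:38 -0700] "GET /index.html HTTP/1.1" 200 2326'

with_query=$(printf '%s\n' "$sample" | count_unique_requests)
without_query=$(printf '%s\n' "$sample" | count_unique_paths)

echo "distinct requests with query strings:    $with_query"
echo "distinct requests without query strings: $without_query"
```

Running the two helpers against a real log gives a rough idea of how many distinct requests -q would collapse.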

What are the requirements to run GoAccess on my server?

GoAccess has minimal requirements: it's written in C and requires only ncurses.
Optionally, it can use GeoIP from MaxMind for geolocation. See the package details related to GoAccess.

How to install GoAccess from source under OS X El Capitan?

The following instructions allow you to install GoAccess on OS X El Capitan without relying on Homebrew. (Admin privileges are needed.)

  1. Install the latest Xcode via the Mac App Store. Check whether "Command Line Tools" were also installed by looking for the directory /Library/Developer/CommandLineTools/
  2. If "Command Line Tools" are not present, install "Xcode Command Line Tools" by typing in the terminal: xcode-select --install
  3. Download the latest version of GoAccess.
    Binaries, configuration file and man page are installed under /usr/local/
  4. Make sure to also add /usr/local/bin/ to $PATH under ~/.bash_profile so when you invoke the goaccess command you don't have to prepend /usr/local/bin/.
  5. You may now edit your goaccess configuration file located in /usr/local/etc/
  6. Enjoy!
Thanks to Valeriano for sharing this!
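
The PATH change from step 4 can be sketched in shell as follows. The example works on a temporary file so it is safe to try; ensure_local_bin_in_profile is a hypothetical helper name, and on a real system the file argument would be ~/.bash_profile.

```shell
#!/bin/sh
# Hypothetical helper: append the PATH line to the profile file in $1,
# unless an identical line is already present.
ensure_local_bin_in_profile() {
    # Single quotes keep $PATH unexpanded so it is evaluated at login time.
    path_line='export PATH="/usr/local/bin:$PATH"'
    touch "$1"
    grep -qxF "$path_line" "$1" || printf '%s\n' "$path_line" >> "$1"
}

profile=$(mktemp)                        # stand-in for ~/.bash_profile
ensure_local_bin_in_profile "$profile"
ensure_local_bin_in_profile "$profile"   # second call adds nothing
cat "$profile"
```

Open a new terminal session (or source the profile) for the change to take effect.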

How to install Tokyo Cabinet from source?

This section describes how to install Tokyo Cabinet with the source package.

                          $ wget http://fallabs.com/tokyocabinet/tokyocabinet-1.4.48.tar.gz
                          $ tar -zxvf tokyocabinet-1.4.48.tar.gz
                          $ cd tokyocabinet-1.4.48
                          $ ./configure --prefix=/usr --enable-off64 --enable-fastest
                          $ make
                          # make install

How to use the on-disk database instead of keeping everything in memory?

If you have a large dataset that won't fit in physical memory or you want data persistence, then you want to use the B+ Tree on-disk database.

                          $ ./configure --enable-utf8 --enable-geoip --enable-tcb=btree
                          $ make
                          # make install

Note: You need to have Tokyo Cabinet installed before configuring GoAccess. You can install Tokyo Cabinet with your package management tool (see dependencies table) or from source (see question above). You may also choose to disable compression; see configuration options for more details.

What features are you planning to add?

Here are some of the top features to add:

  • Ability to filter dataset by fields or regex. e.g. filter by fields such as host, request, etc.
  • Reduce memory footprint.
  • Increase performance when parsing the log file.
  • Add command-line options.
  • Add more reports to it.

Please see GitHub for more details.

GoAccess and Amazon S3...

Here's an extensible Amazon S3 and Cloudfront log parser in Python that uses GoAccess.
(Thanks to Viktor Nagy)

How can I configure IIS log format?

GoAccess has a generic predefined log format option in the config file & dialog.
However, this script can automatically extract the proper format from IIS log files.

For new releases...

If you would like to be notified of new releases of GoAccess then please follow the project on Twitter. Feel free to share it with others too :)