
Apophenia is an open statistical library for working with data sets and statistical models. It provides functions on the same level as those of the typical stats package (such as OLS, probit, or singular value decomposition) but gives the user more flexibility to be creative in model-building. The core functions are written in C, but experience has shown them to be easy to bind to in Python/Julia/Perl/Ruby/&c.
It is written to scale well, to comfortably work with gigabyte data sets, million-step simulations, or computationally-intensive agent-based models. If you have tried using other open source tools for computationally demanding work and found that those tools weren't up to the task, then Apophenia is the library for you.
The library has been growing and improving since 2005, and has been downloaded over 10,000 times. To date, it has over two hundred functions to facilitate statistical computing.
For the full list, click the index link from the header.
Most users will just want to download the latest packaged version linked from the "Download Apophenia here" header.
Those who would like to work on a cutting-edge copy of the source code can get the latest version by cutting and pasting the following onto the command line. If you follow this route, be sure to read the development README in the Apophenia directory this command will create.
git clone https://github.com/bk/Apophenia.git
To start off, have a look at this Gentle Introduction to the library.
The outline gives a more detailed narrative.
The index lists every function in the library, with detailed reference information. Notice that the header to every page has a link to the outline and the index.
To really go in depth, download or pick up a copy of Modeling with Data, which discusses general methods for doing statistics in C with the GSL and SQLite, as well as Apophenia itself. A Cross-paradigm Modeling Framework (PDF) discusses some of the theoretical structures underlying the library.
There is a wiki with some convenience functions, tips, and so on.
Much of what Apophenia does can be done in any typical statistics package. The apop_data element is much like an R data frame, for example, and there is nothing special about being able to invert a matrix or take the product of two matrices with a single function call (apop_matrix_inverse and apop_dot, respectively). Even more advanced features like Loess smoothing (apop_loess) and the Fisher Exact Test (apop_test_fisher_exact) are not especially Apophenia-specific. But here are some things that are noteworthy.
When reading text files, the marker for missing data can be one of your choosing (e.g., apop_opts.nan_string = "N/A";), and there need not be any delimiters, as in the case of fixed-width files. If you are a heavy SQLite user, Apophenia may be useful to you simply for its apop_text_to_db function.
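To make the idea of a user-chosen missing-data marker concrete, here is a minimal sketch in plain C. It is not Apophenia's parser: parse_field and parse_line are hypothetical helpers written for this illustration, and the sketch assumes comma-delimited input and exact matching of the marker.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Apophenia): convert one text field to a
   double. A blank field, or one equal to the caller's missing-data marker,
   becomes NaN. */
double parse_field(const char *field, const char *nan_marker){
    assert(field && nan_marker);
    if (field[0] == '\0' || strcmp(field, nan_marker) == 0)
        return NAN;
    char *end;
    double value = strtod(field, &end);
    /* Text that is not entirely a number is treated as missing, too. */
    return (end == field || *end != '\0') ? NAN : value;
}

/* Hypothetical helper: split a comma-delimited line in place and parse up
   to max_fields fields into out. Returns the number of fields parsed. */
size_t parse_line(char *line, const char *nan_marker,
                  double *out, size_t max_fields){
    size_t n = 0;
    char *start = line;
    while (n < max_fields){
        char *comma = strchr(start, ',');
        if (comma) *comma = '\0';          /* terminate the current field */
        out[n++] = parse_field(start, nan_marker);
        if (!comma) break;                 /* that was the last field */
        start = comma + 1;
    }
    return n;
}
```

Calling parse_line on a writable copy of the line 1.5,N/A,,3 with the marker "N/A" would fill the output array with 1.5, NaN, NaN, and 3. A real parser would also have to handle quoting, mixed delimiters, and fixed-width columns, which this sketch leaves out.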
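If you would like a maximum likelihood search to cycle through the dimensions until it converges to within some tolerance (call it eps), just add a settings group specifying the tolerance at which the cycle should stop: Apop_settings_add_group(your_model, apop_mle, .dim_cycle_tolerance=eps)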
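The general technique behind a cycle tolerance of this kind can be sketched in plain C: optimize one dimension at a time, and stop when a full pass over the dimensions improves the objective by less than the tolerance. This is an illustration under stated assumptions, not Apophenia's implementation: objective_fn, maximize_one_dim, cycle_maximize, and example_objective are hypothetical names, and the one-dimensional step is a golden-section search that assumes the objective is unimodal within the bracket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical objective type: returns the value to be maximized. */
typedef double (*objective_fn)(const double *x, size_t dim);

/* Maximize f along coordinate d, holding the others fixed, by
   golden-section search on [x[d]-width, x[d]+width]. Assumes f is
   unimodal in that coordinate within the bracket. */
static void maximize_one_dim(objective_fn f, double *x, size_t dim,
                             size_t d, double width){
    const double ratio = 0.6180339887498949;   /* (sqrt(5)-1)/2 */
    double lo = x[d] - width, hi = x[d] + width;
    for (int i = 0; i < 100; i++){
        double a = hi - ratio * (hi - lo);
        double b = lo + ratio * (hi - lo);
        x[d] = a; double fa = f(x, dim);
        x[d] = b; double fb = f(x, dim);
        if (fa < fb) lo = a;   /* the maximum lies in [a, hi] */
        else         hi = b;   /* the maximum lies in [lo, b] */
    }
    x[d] = (lo + hi) / 2.0;
}

/* Cycle over the dimensions until one full pass improves the objective by
   less than tolerance, or max_cycles passes have been made. Returns the
   last accepted objective value; x holds the corresponding point. */
double cycle_maximize(objective_fn f, double *x, size_t dim,
                      double width, double tolerance, int max_cycles){
    assert(f && x && dim > 0 && tolerance > 0);
    double previous = f(x, dim);
    for (int cycle = 0; cycle < max_cycles; cycle++){
        for (size_t d = 0; d < dim; d++)
            maximize_one_dim(f, x, dim, d, width);
        double current = f(x, dim);
        if (current - previous < tolerance) return current;
        previous = current;
    }
    return previous;
}

/* Example objective: a concave quadratic whose two coordinates interact,
   so a single pass over the dimensions is not enough. */
double example_objective(const double *x, size_t dim){
    (void)dim;
    return -(x[0] - 1) * (x[0] - 1) - (x[1] + 2) * (x[1] + 2) - x[0] * x[1];
}
```

Starting from (0, 0) with a bracket half-width of 10 and a tolerance of 1e-10, cycle_maximize applied to example_objective should settle near the analytic maximum at (8/3, -10/3). A smaller tolerance buys more passes over the dimensions and a tighter answer, which is the trade-off the eps setting controls.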
If you're interested, write to the maintainer (Ben Klemens), or join the GitHub project.