Carnegie Mellon University

Frequently Asked Questions

No. We are happy to contribute this work to the public good, as we have since the project began.

We are happy to be part of a growing milieu of projects for causal search, simulation, and estimation! We plan to list other causal projects we know of here; they all have different foci and implement different algorithms, so it helps to be aware of the range of functionality available. They are also on different platforms, but we feel that one should break down these somewhat artificial boundaries to the extent possible. This will be a very incomplete list for a while, but we will add to it.

  • Under the direction of Kun Zhang, a separate Python project, causal-learn, was initiated in the Philosophy Department at Carnegie Mellon University. It directly translates some Tetrad algorithms into Python and makes these and other novel algorithms available to the Python world. This project is now part of Py-Why. A working paper describing causal-learn may be found on arXiv: Zheng, Yujia, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, and Kun Zhang. "Causal-learn: Causal Discovery in Python." arXiv preprint arXiv:2307.16405 (2023).
  • We should mention separately the Py-Why space, a Python project that includes estimation and graph representation tools.
  • There is the LiNGAM project of Shimizu, now included in causal-learn, for the linear non-Gaussian case.
  • The DoWhy project is another very nice project in Python.
  • PCALG is a large causal project in R, with some algorithms not implemented in Tetrad. A reference for this project is Kalisch, M., Mächler, M., Colombo, D., Maathuis, M. H., & Bühlmann, P. (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software, 47, 1-26.
  • Bnlearn is another large causal project in R, with algorithms not implemented in Tetrad. A reference is Scutari, M., & Ness, R. (2012). bnlearn: Bayesian network structure learning, parameter learning, and inference. R package version, 3, 805.

You may be interested in our Tetrad application, the Python interface, the R interface, or the command line tool. Please see the pages for download instructions.

The history of our source code since 2015 is available in our public GitHub repository for the Tetrad project. Feel free to peruse it and give us comments. Source code and javadocs for particular releases are available on Maven Central.

Documentation, including our manual and our latest javadocs online, is available.

For a quick introduction, the following search algorithms are supported for the causally sufficient case:

  • PC, CPC, PC-MAX, CCD, FGES, GRaSP, SP, FASK, IMaGES, LiNGAM

The following algorithms are available for the causally insufficient case (i.e., where unmeasured common causes are possible):

  • FCI, FCI-Max, GFCI, GRaSP-FCI, RFCI, SP-FCI, SVarFCI, SVarGFCI

For Markov blanket search:

  • FGES-MB, MBFS

For undirected graphs:

  • FAS, MGM

For pairwise orientation:

  • R3, Skew, RSkew, FASK-PW

For searching for structure over latents:

  • BPC, FOFC, FTFC

Many of these algorithms accept knowledge in the form of lists/tiers of forbidden and/or required edges.
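As a rough illustration of what tiered knowledge can mean, here is a minimal sketch in plain Python. It is not the Tetrad API: the class name `TieredKnowledge` and its methods are hypothetical, and the sketch assumes the common temporal reading of tiers, in which a variable in a later tier may not cause a variable in an earlier tier. Explicit forbidden and required edges are listed alongside the tiers.

```python
from dataclasses import dataclass, field


@dataclass
class TieredKnowledge:
    """Hypothetical container for background knowledge (not the Tetrad API)."""
    tiers: list = field(default_factory=list)    # list of sets of variable names
    forbidden: set = field(default_factory=set)  # explicit (cause, effect) pairs
    required: set = field(default_factory=set)   # explicit (cause, effect) pairs

    def tier_of(self, var):
        """Return the index of the tier containing var, or None if untiered."""
        for i, tier in enumerate(self.tiers):
            if var in tier:
                return i
        return None

    def is_forbidden(self, cause, effect):
        """True if cause -> effect is ruled out explicitly or by tier order."""
        if (cause, effect) in self.forbidden:
            return True
        a, b = self.tier_of(cause), self.tier_of(effect)
        # Assumed rule: a later-tier variable may not cause an earlier-tier one.
        return a is not None and b is not None and a > b

    def is_required(self, cause, effect):
        """True if cause -> effect was listed as a required edge."""
        return (cause, effect) in self.required


# Made-up variable names, for illustration only.
k = TieredKnowledge(
    tiers=[{"age", "sex"}, {"smoking"}, {"cancer"}],
    forbidden={("age", "sex")},
    required={("smoking", "cancer")},
)
print(k.is_forbidden("cancer", "smoking"))  # later tier -> earlier tier
print(k.is_forbidden("smoking", "cancer"))  # earlier tier -> later tier
```

A search algorithm that honors such knowledge would consult `is_forbidden` before adding or orienting an edge and would keep every edge for which `is_required` holds.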

Many of these algorithms use conditional independence information or score information; such independence tests or scores are available for the continuous, discrete, and mixed continuous/discrete cases.
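To give a flavor of what an independence test for the continuous case might look like, here is a minimal sketch of the Fisher Z test of zero partial correlation, a standard test for linear Gaussian data. It is written in plain Python and is not Tetrad's implementation; the function names, the `corr` callback, and the example correlations are all assumptions made for illustration. It also assumes every partial correlation lies strictly between -1 and 1.

```python
import math
from statistics import NormalDist


def partial_corr(corr, x, y, z):
    """Partial correlation of x and y given the list z, computed recursively.

    corr(u, v) must return the plain correlation of variables u and v.
    """
    if not z:
        return corr(x, y)
    z0, rest = z[0], z[1:]
    r_xy = partial_corr(corr, x, y, rest)
    r_xz = partial_corr(corr, x, z0, rest)
    r_yz = partial_corr(corr, y, z0, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def fisher_z_pvalue(corr, n, x, y, z=()):
    """Two-sided p-value for the hypothesis that x and y are independent given z."""
    r = partial_corr(corr, x, y, list(z))
    # Fisher's transform, scaled by the effective sample size.
    stat = math.sqrt(n - len(z) - 3) * 0.5 * math.log((1 + r) / (1 - r))
    return 2 * (1 - NormalDist().cdf(abs(stat)))


# Made-up correlations consistent with a chain A -> B -> C.
R = {
    frozenset(("A", "B")): 0.6,
    frozenset(("B", "C")): 0.5,
    frozenset(("A", "C")): 0.3,
}


def corr(u, v):
    return R[frozenset((u, v))]


p_marginal = fisher_z_pvalue(corr, 500, "A", "C")
p_given_b = fisher_z_pvalue(corr, 500, "A", "C", ["B"])
print(p_marginal < 0.05)  # A and C look dependent marginally
print(p_given_b > 0.05)   # but independent once B is conditioned on
```

A constraint-based search such as PC would call a test like this many times, treating a p-value above the chosen significance level as a judgment of independence.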

Many of these algorithms can use a d-separation oracle as input to test functionality.
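For readers unfamiliar with the idea, a d-separation oracle answers independence questions directly from a known true graph rather than from data. The sketch below is one textbook way to do this in plain Python (restrict to the ancestral subgraph, moralize it, then check connectivity). It is not Tetrad's code; the function names and the `parents` dictionary representation are assumptions, and it assumes `x` and `y` are distinct and not in the conditioning set.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors in the DAG."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given the set z.

    parents maps each node to the collection of its parents.
    """
    z = set(z)
    keep = ancestors(parents, {x, y} | z)
    # Moralize the ancestral subgraph: link each node to its parents
    # and link every pair of parents that share a child.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = [p for p in parents.get(child, ()) if p in keep]
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Search for an undirected path from x to y that avoids z.
    seen, stack = {x}, [x]
    while stack:
        for n in adj[stack.pop()]:
            if n == y:
                return False
            if n not in seen and n not in z:
                seen.add(n)
                stack.append(n)
    return True


# Collider X -> C <- Y with a descendant C -> D.
collider = {"C": ["X", "Y"], "D": ["C"]}
print(d_separated(collider, "X", "Y", []))     # nothing conditioned on
print(d_separated(collider, "X", "Y", ["D"]))  # descendant of the collider
```

Feeding an algorithm answers from such an oracle removes statistical error from the picture, so any mistake in its output points to the algorithm or its implementation.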

There is much additional functionality; please see our manual.

We're always happy to get feedback, so thanks in advance! Good or bad! The best way to give us feedback is to open an issue for us on our GitHub site. You can browse there to see if the question has been raised and answered. If the issue is open and unresolved, you can add your voice to it. 

Bug reports are fantastic. We aim for our software to be bug-free. So please, if you see something, say something! With each major and minor release, we aim to fix all bugs known at the time.

Requests for improvements and new features are welcome as well. Even though our team is small, we can generally address the feedback that comes our way, so bring it on! We maintain a wish list; perhaps you want to vote for some particular item or add new items.  Some requests are for significant new features; we will ponder these until we have a good response and can do the implementations.

Code contributions consistent with the goals of the project are welcome! Contributions may be made through the GitHub pull request process and will be reviewed.

Certainly, quite a lot. Please see our Ongoing Projects page.

For nitty gritty details, please see our release history.

For our recent downloadables, please see our page on Maven Central.

No, the server they depend on is no longer available. The current and all future versions of TETRAD will be on Maven Central and will only include versions starting with 7.1.0. Data files and graph files from earlier versions of TETRAD may be readable in the current version, but entire sessions will not be. We recommend everyone transition to the current stable version.

This used to be an issue, but it is no longer one.

The Tetrad project includes robust methods for testing its own software. In the Tetrad application, methods are available to simulate data, run algorithms on that data, and compare the output to the known true graph from the simulation. Where feasible, algorithms may be tested using a d-separation oracle, and all such algorithms are correct when so tested, so that any errors that arise are due to unavoidable statistical errors in tests or scores. We invite you to explore these tools to convince yourself that the software is working correctly. Simulated data and graphs may be generated in the Tetrad application and saved out for testing with other software platforms to compare. In fact, we have an API called 'algcomparison' which will do rigorous simulation testing of algorithms with publication-quality comparison tables using graph statistics of the user's choice. We are proposing to translate this comparison software into Python for wider adoption, as we find it very helpful to compare different algorithms head-to-head on identical problems using identical tools, and Python is an excellent language to use to compare across many different languages.
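As a small illustration of comparing an estimated graph with the known true graph, the sketch below computes adjacency precision and recall, two commonly used graph statistics. It is plain Python and not the 'algcomparison' API; the function name, the edge-list representation, and the example graphs are assumptions made for illustration.

```python
def adjacency_stats(true_edges, est_edges):
    """Adjacency precision and recall, ignoring edge orientation.

    Each edge is a pair of node names; (a, b) and (b, a) count as the
    same adjacency.
    """
    true_adj = {frozenset(e) for e in true_edges}
    est_adj = {frozenset(e) for e in est_edges}
    tp = len(true_adj & est_adj)  # adjacencies present in both graphs
    # Convention chosen for this sketch: an empty denominator scores 1.0.
    precision = tp / len(est_adj) if est_adj else 1.0
    recall = tp / len(true_adj) if true_adj else 1.0
    return precision, recall


# Made-up graphs: the estimate reverses one edge and misplaces another.
true_graph = [("X", "Y"), ("Y", "Z"), ("X", "W")]
estimated = [("Y", "X"), ("Y", "Z"), ("Z", "W")]
precision, recall = adjacency_stats(true_graph, estimated)
print(f"adjacency precision={precision:.2f}, recall={recall:.2f}")
```

Averaging such statistics over many simulated datasets, for several algorithms run on identical data, gives the kind of head-to-head comparison table described above.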

One might say that the only purpose of making causal search software is to analyze real data, and one would undoubtedly be correct. One hurdle to analyzing real data is that the formats and provenance of real datasets vary widely, and ground truth can be difficult to ascertain. To simplify these problems and systematize our analysis of real data, we have made a repository of real datasets, along with some realistically simulated data, pre-formatted to make them easy to load into Tetrad and other tools, with what ground truth we are able to ascertain. Please let us know if you have a public dataset, especially one commonly used, that you would like to be made available in this repository.

The choice is partly historical (Java used to be a super-cool language), though, to be fair, we have found Java to be an excellent language for our project for a number of reasons. 

  • Tetrad is a very large project at this point, and the language's strong typing and object orientation allow this to be feasible. Strong typing, for instance, makes refactoring very easy. 
  • Java has allowed us to make a cross-platform GUI application available, which has proven to be quite useful for educational purposes for users who want to learn about causal inference but prefer not to work directly with code.  
  • Since Java is relatively easy to parallelize and can be made to run well on large machines, it has allowed us to scale some of our algorithms up considerably, something that has proven challenging to date in Python. 
  • Java is a fairly fast language, with good implementations competing in speed with C++.
  • Java is inherently a safe language without the platform security issues associated with many other languages. 

That said, Python has a lot of energy behind it, so we have made forays into Python. See our "Tetrad in Python" page. We have also done recent work in making Tetrad available in R. In the past, Python support and R support were made available using the py-causal and r-causal packages; these packages are now deprecated, as they use a very old version of Tetrad; users should switch to our more recent Python and R interfaces.

The current version of TETRAD does not support graphics card (GPU) processing (we are looking into it). Portions of Tetrad are parallelized, such as FGES and the adjacency search of PC (especially if the 'stable' option is selected). We want to improve parallelization and extend it to more algorithms and APIs in the repository.

This problem has been fixed in the most recent version of Tetrad; to solve the problem, simply do a 'git pull' (or in the IntelliJ interface, Git->Update Project), and the problem will magically disappear.