Sourceforge Reformatted is the world's largest open source software development and distribution environment. It hosts more than one hundred thousand projects. This dataset lists those projects, each described by name, description, license, programming language, topic, registration date, activity, number of downloads, screenshot and more. For instance, you can use it to search more effectively for the project you need or to analyze software trends.

The "Sourceforge Reformatted" dataset is available for purchase here for $1,100.00 USD.

This is a screenshot of the February 2010 dataset within Microsoft Excel, showing projects sorted by number of downloads.

Download sample data (from February 2010) in Microsoft Excel, Microsoft Access, XML and MySQL formats.


Each project is described by title, description, image, activity, rank, registration date, date of the latest file downloaded and number of downloads. The dataset also indicates whether the application runs on MS Windows, Macintosh, Unix and PDAs (this is an educated guess; see below for more details on the actual operating systems). Each project is also classified by whether it has a graphical user interface, runs from the terminal (command line) or runs without any interaction. Each project is further described by a set of 9 additional tables giving the database environment, development status, intended audience, license, operating system, programming language, topic, translation and user interface associated with each project.

With this dataset, you can answer questions such as:

  • List all projects on Education, with the keyword 'portal' in the description, that have been downloaded at least one thousand times:
    Carso's Virtual University: A e-learning portal that facilitate the university knowledge and student...
    SchoolAlumni Portal: Portal software for alumni high school, content management system, and online...
    Distributed ISBN portal: A distributed search portal of common sources of ISBN numbers...
  • How many new projects have been registered each year for the following functional programming languages: Erlang, Groovy, Haskell, Lisp, Scala?
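
As a rough illustration, the two questions above could be answered along these lines once the data is loaded into a SQL database. This is only a sketch: the table names (project, topic, programming_language), the column names and the demo rows are assumptions invented for the example, not the dataset's actual layout, and SQLite is used here simply because it ships with Python.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions made for
# illustration and may differ from the layout of the delivered dataset.
SCHEMA = """
CREATE TABLE project (
    id INTEGER PRIMARY KEY,
    title TEXT,
    description TEXT,
    register_date TEXT,      -- ISO date, e.g. '2006-01-10'
    downloads INTEGER
);
CREATE TABLE topic (project_id INTEGER, topic TEXT);
CREATE TABLE programming_language (project_id INTEGER, language TEXT);
"""


def build_demo_db():
    """Create an in-memory database filled with invented demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO project VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Demo School Portal", "A portal for schools", "2005-03-01", 2500),
            (2, "Tiny Portal", "Small education portal", "2006-07-15", 40),
            (3, "Haskell Toolkit", "Parsing library", "2006-01-10", 900),
            (4, "Scala Widgets", "GUI helpers", "2006-11-02", 100),
            (5, "Haskell Web", "Web framework", "2007-02-20", 300),
        ],
    )
    conn.executemany(
        "INSERT INTO topic VALUES (?, ?)",
        [(1, "Education"), (2, "Education"), (3, "Software Development")],
    )
    conn.executemany(
        "INSERT INTO programming_language VALUES (?, ?)",
        [(1, "PHP"), (3, "Haskell"), (4, "Scala"), (5, "Haskell")],
    )
    return conn


def education_portals(conn, min_downloads=1000):
    """Education projects mentioning 'portal' with enough downloads."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.title, p.downloads
        FROM project AS p
        JOIN topic AS t ON t.project_id = p.id
        WHERE t.topic = 'Education'
          AND p.description LIKE '%portal%'
          AND p.downloads >= ?
        ORDER BY p.downloads DESC
        """,
        (min_downloads,),
    )
    return rows.fetchall()


def registrations_per_year(conn, languages):
    """Count projects registered per language and per year."""
    placeholders = ", ".join("?" for _ in languages)
    rows = conn.execute(
        f"""
        SELECT l.language,
               substr(p.register_date, 1, 4) AS reg_year,
               COUNT(DISTINCT p.id)
        FROM project AS p
        JOIN programming_language AS l ON l.project_id = p.id
        WHERE l.language IN ({placeholders})
        GROUP BY l.language, reg_year
        ORDER BY l.language, reg_year
        """,
        tuple(languages),
    )
    return rows.fetchall()
```

Calling education_portals(build_demo_db()) returns the matching titles with their download counts, and registrations_per_year(conn, ["Erlang", "Groovy", "Haskell", "Lisp", "Scala"]) returns one (language, year, count) row per combination that has at least one project. The same SQL should carry over to MySQL with little change, provided the table and column names are adapted to the real dataset.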

Note: Contact us if you need help querying this dataset. This dataset is also available for previous years.


Here are the distinct values for each of the 9 additional tables:



This product and this website are not approved, endorsed or authorized by the team.

Terms of use. This dataset may not be reproduced or stored in any other website or included in any public or private electronic retrieval system or service without our prior written permission. Any rights not expressly granted in these terms are reserved.

Use at Your Own Risk. We provide the material available through this website for informational purposes only. We try to ensure that information is accurate, and that the services offered are reliable. Yet despite our efforts, errors may occur from time to time. Before you act on any information you've found on our site, you should confirm any facts that are important to your decision. IF YOU RELY ON ANY INFORMATION OR SERVICE AVAILABLE THROUGH THIS WEBSITE, YOU DO SO AT YOUR OWN RISK. YOU UNDERSTAND THAT YOU ARE SOLELY RESPONSIBLE FOR ANY DAMAGE OR LOSS YOU MAY INCUR THAT RESULTS FROM YOUR USE OF ANY SERVICE OR ANY MATERIAL AND/OR DATA DOWNLOADED FROM OR OTHERWISE PROVIDED THROUGH THIS WEBSITE.