How big is it |
A few days ago, I was thinking about search engines crawling through this website, and
I began to wonder just how many web pages there are here. Calculating this total is not
a simple matter of counting files on disk, because most of the web pages are generated from
entries in the database. One recent evening, I started to design a formula to estimate how many
web pages there are. The result will not be 100% accurate, but it should be close.
|
Pages on disk |
First, let's count the number of pages on disk:
$ ls *.php | wc -l
69
|
Number of categories |
There is a page for each category:
# select count(*) from categories;
count
-------
120
(1 row)
|
Number of ports |
There are ports (status 'A'), and there are deleted ports (status 'D'). Both count towards the total, so I'll show both:
# select count(*) from ports_all where status = 'A';
count
-------
33169
# select count(*) from ports_all where status = 'D';
count
-------
23908
(1 row)
|
Number of files in the ports tree |
There is a page for each file in the ports tree:
[dan@ngaio:/usr/ports] $ find . | wc -l
169943
[dan@ngaio:/usr/ports] $
Count last performed at Fri, 20 Dec 2024 03:01:08 GMT |
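One caveat worth illustrating: `find .` with no filter lists every entry, including directories and `.` itself, not only regular files. Whether directories also have their own pages is not stated here, so the following is only a sketch of the difference, run against a throwaway directory. The temporary tree and the variable names are made up for the illustration.

```shell
# Build a tiny throwaway tree: one subdirectory and two files (hypothetical names).
tmp=$(mktemp -d)
mkdir "$tmp/subdir"
touch "$tmp/Makefile" "$tmp/subdir/pkg-descr"

# Every entry, including "." and the subdirectory (same form as the command above).
all_entries=$(cd "$tmp" && find . | wc -l | tr -d ' ')

# Regular files only.
files_only=$(cd "$tmp" && find . -type f | wc -l | tr -d ' ')

echo "all entries: $all_entries, files only: $files_only"
# prints: all entries: 4, files only: 2

rm -rf "$tmp"
```

If only regular files should be counted, adding `-type f` to the command would be one way to do it.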
Number of commits |
There is a page for each commit:
# select count(*) from commit_log;
count
-------
680262
(1 row)
|
Number of ports for each commit |
For each commit, you can view the files modified by that commit for a particular port:
# select count(*) from commit_log_ports;
count
-------
1514840
(1 row)
|
How many days? |
For each day, there is a page showing the commits for that day. How many days do we have?
# select count(distinct commit_date) from commit_log;
count
-------
8959
(1 row)
|
How many users? |
Each user has a page:
# select count(*) from users;
count
-------
14966
(1 row)
|
How many watch lists? |
For each watch list, there is a page:
# select count(*) from watch_list;
count
-------
15439
(1 row)
|
Estimated total |
That gives a grand total of 2,461,675 pages. At my last count, that's
about 0.030549% of the web pages on Google [1].
Notes
- These statistics are updated daily.
- [1] The number of Google pages used in this calculation is 8,058,044,651.
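The arithmetic behind the estimate can be sketched in a few lines of shell. This is not the script the site runs, just a reconstruction: the counts are copied from the sections above, and the variable names are my own labels.

```shell
# Counts copied from the sections above (variable names are illustrative).
php_pages=69
categories=120
active_ports=33169
deleted_ports=23908
tree_files=169943
commits=680262
commit_ports=1514840
days=8959
users=14966
watch_lists=15439

# Grand total: one page per counted item.
total=$(( php_pages + categories + active_ports + deleted_ports + tree_files \
        + commits + commit_ports + days + users + watch_lists ))
echo "total pages: $total"
# prints: total pages: 2461675

# Share of the Google page count quoted in the notes.
google_pages=8058044651
percent=$(awk -v t="$total" -v g="$google_pages" 'BEGIN { printf "%.6f", 100 * t / g }')
echo "share of Google: $percent%"
# prints: share of Google: 0.030549%
```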
|
How much diskspace? |
The total space used by the FreshPorts database is:
# select pg_database_size('freshports.org');
pg_database_size
------------------
56,866,853,347
(1 row)
That's in bytes.
This value might be easier to parse: 53 GB (counting 1 GB as 1024^3 bytes) |
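For the curious, the conversion from bytes can be sketched as below. It assumes the friendlier figure uses binary units (1024^3 bytes), which is what matches the number shown; the decimal figure is included for comparison. PostgreSQL's built-in `pg_size_pretty()` does a similar conversion inside the database.

```shell
# Size in bytes, as reported by pg_database_size above.
bytes=56866853347

# Binary gigabytes (GiB): divide by 1024^3 and round to a whole number.
gib=$(awk -v b="$bytes" 'BEGIN { printf "%.0f", b / (1024 * 1024 * 1024) }')

# Decimal gigabytes (GB): divide by 10^9, one decimal place.
gb=$(awk -v b="$bytes" 'BEGIN { printf "%.1f", b / 1000000000 }')

echo "$gib GiB (binary), $gb GB (decimal)"
# prints: 53 GiB (binary), 56.9 GB (decimal)
```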