Port details |
- crawl Small, efficient web crawler with advanced features
- 0.4_17 www
- 0.4_17: Version of this port present on the latest quarterly branch.
- Maintainer: portmaster@BSDforge.com
- Port Added: 2001-06-23 13:09:54
- Last Update: 2024-05-24 18:00:00
- Commit Hash: 83ee3d4
- People watching this port also watch: p5-File-Tail, stunnel, libunicode, p5-libwww, sox
- License: BSD4CLAUSE
- WWW:
- https://www.monkey.org/~provos/crawl/
- Description:
- The crawl utility starts a depth-first traversal of the web at the
specified URLs. It stores all JPEG images that match the configured
constraints. Crawl is fairly fast and allows for graceful termination.
After terminating crawl, it is possible to restart it at exactly
the same spot where it was terminated. Crawl keeps a persistent
database that allows multiple crawls without revisiting sites.
The main reason for writing crawl was the lack of simple open source
web crawlers. Crawl is only a few thousand lines of code and fairly
easy to debug and customize.
Some of the main features:
- Saves encountered JPEG images
- Image selection based on regular expressions and size constraints
- Resume previous crawl after graceful termination
- Persistent database of visited URLs
- Very small and efficient code
- Supports robots.txt
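
The traversal described above can be illustrated with a short Python sketch. It is not crawl's actual code (crawl is written in C and links against Berkeley DB); the names `crawl_depth_first`, `fetch_links`, and `visited` are hypothetical, an in-memory set stands in for the persistent database, and only the regular-expression filter is modelled (size constraints and robots.txt handling are left out).

```python
import re


def crawl_depth_first(start_urls, fetch_links, visited, image_pattern=r"\.jpe?g$"):
    """Return matching image URLs in the order a depth-first walk meets them.

    fetch_links(url) is a caller-supplied function returning the links found
    on a page (an assumption for this sketch). `visited` is a set that the
    caller keeps between runs, standing in for a persistent URL database.
    """
    matcher = re.compile(image_pattern, re.IGNORECASE)
    images = []
    # Explicit stack; reversed so the first start URL is explored first.
    stack = list(reversed(list(start_urls)))
    while stack:
        url = stack.pop()
        if url in visited:
            continue  # never revisit a URL, in this run or a later one
        visited.add(url)
        if matcher.search(url):
            images.append(url)  # a real crawler would download and store it
            continue
        # Push children in reverse so the first link on the page is next.
        stack.extend(reversed(list(fetch_links(url))))
    return images
```

Because `visited` outlives the call, running the function again with the same set skips everything already seen, which is the behavior the persistent database of visited URLs is meant to provide.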
- Manual pages:
- FreshPorts has no man page information for this port.
- pkg-plist: as obtained via: make generate-plist
- Dependency lines:
-
- To install the port:
- cd /usr/ports/www/crawl/ && make install clean
- To add the package, run one of these commands:
- pkg install www/crawl
- pkg install crawl
NOTE: If this package has multiple flavors (see below), then use one of them instead of the name specified above.
- PKGNAME: crawl
- Flavors: there is no flavor information for this port.
- distinfo:
- SHA256 (crawl-0.4.tar.gz) = fdf1c49dc21598fd9a6f221b42449140e46617a7c0fd36051c10dbfdcaa33651
SIZE (crawl-0.4.tar.gz) = 111084
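
As an illustration of how the two distinfo values can be used, the following Python sketch checks a downloaded file against an expected SHA256 digest and size. The function name `matches_distinfo` is hypothetical, and this is not how the ports framework performs the check; it only shows the comparison the two fields make possible.

```python
import hashlib
import os


def matches_distinfo(path, expected_sha256, expected_size):
    """Return True if the file at `path` has the given size and SHA256 digest."""
    # Compare the size first: it is cheap and rejects truncated downloads.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large distfiles are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()
```

With the values listed above, a call would look like `matches_distinfo("crawl-0.4.tar.gz", "fdf1c49dc21598fd9a6f221b42449140e46617a7c0fd36051c10dbfdcaa33651", 111084)`, assuming the tarball is in the current directory.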
- Dependencies
- NOTE: FreshPorts displays only information on required and default dependencies. Optional dependencies are not covered.
- Build dependencies:
-
- pkgconf>=1.3.0_1 : devel/pkgconf
- Library dependencies:
-
- libevent.so : devel/libevent
- libdb-18.1.so : databases/db18
- There are no ports dependent upon this port
Configuration Options:
- ===> The following configuration options are available for crawl-0.4_17:
EXAMPLES=on: Build and/or install examples
===> Use 'make config' to modify these settings
- Options name:
- www_crawl
- USES:
- bdb:18 pkgconfig
- FreshPorts was unable to extract/find any pkg message
- Master Sites:
|
Commit History - (may be incomplete: for full details, see links to repositories near top of page) |
Commit | Credits | Log message |
0.4_17 24 May 2024 18:00:00
    |
Fernando Apesteguía (fernape)  Author: Chris Hutchinson |
www/crawl: update BDB to version 18
PR: 279210
Reported by: portmaster@bsdforge.com (maintainer) |
0.4_16 16 May 2024 06:19:45
    |
Fernando Apesteguía (fernape)  Author: Chris Hutchinson |
www/crawl: set MAINTAINER && remove DEPRECATED
Maintainer already maintains several ports.
PR: 278986
Reported by: portmaster@bsdforge.com |
0.4_15 04 May 2024 10:06:18
    |
Daniel Engberg (diizzy)  |
www/crawl: Fix EXPIRATION_DATE
2022-06-30 should be 2024-06-30 |
0.4_15 04 May 2024 10:01:27
    |
Daniel Engberg (diizzy)  |
www/crawl: Deprecate and set expiration date to 2024-06-30
Abandonware and obsolete, last release in 2003 and no longer developed.
Redirect users to ftp/wget |
0.4_15 26 Feb 2024 13:14:59
    |
Muhammad Moinur Rahman (bofh)  |
www/crawl: Moved man to share/man
Approved by: portmgr (blanket) |
07 Sep 2022 21:58:51
    |
Stefan Eßer (se)  |
Remove WWW entries moved into port Makefiles
Commit b7f05445c00f has added WWW entries to port Makefiles based on
WWW: lines in pkg-descr files.
This commit removes the WWW: lines of moved-over URLs from these
pkg-descr files.
Approved by: portmgr (tcberner) |
0.4_14 07 Sep 2022 21:10:59
    |
Stefan Eßer (se)  |
Add WWW entries to port Makefiles
It has been common practice to have one or more URLs at the end of the
ports' pkg-descr files, one per line and prefixed with "WWW:". These
URLs should point at a project website or other relevant resources.
Access to these URLs required processing of the pkg-descr files, and
they have often become stale over time. If more than one such URL was
present in a pkg-descr file, only the first one was transferred into
the port INDEX, but for many ports only the last line did contain the
port specific URL to further information.
There have been several proposals to make a project URL available as
a macro in the ports' Makefiles, over time.
(Only the first 15 lines of the commit message are shown above) |
0.4_14 20 Jul 2022 14:23:26
    |
Tobias C. Berner (tcberner)  |
www: remove 'Created by' lines
A big Thank You to the original contributors of these ports:
*
* <hvo.pm@xs4all.nl>
* Aaron Dalton <aaron@FreeBSD.org>
* Aaron Dalton <aaron@daltons.ca>
* Aaron LI <aly@aaronly.me>
* Aaron Zauner <az_mail@gmx.at>
* Abel Chow <achow@transoft.net>
* Adam Weinberger <adamw@FreeBSD.org>
* Ade Lovett <ade@FreeBSD.org>
* Adrian Steinmann <ast@marabu.ch>
* Akinori MUSHA aka knu <knu@idaemons.org>
(Only the first 15 lines of the commit message are shown above) |
0.4_14 06 Apr 2021 14:31:07
    |
Mathieu Arnold (mat)  |
Remove # $FreeBSD$ from Makefiles. |
0.4_14 02 Aug 2019 13:30:40
  |
jbeich  |
devel/libevent2: update to 2.1.11
Changes: https://github.com/libevent/libevent/releases/tag/release-2.1.11-stable
ABI: https://abi-laboratory.pro/tracker/timeline/libevent/
PR: 239599
Reported by: GitHub (watch releases)
Approved by: zeising (maintainer)
MFH: 2019Q3 (maybe security, partially restores 2.1.8 ABI)
Differential Revision: https://reviews.freebsd.org/D21133 |
0.4_13 25 Dec 2017 09:00:39
  |
amdmi3  |
- Switch to options helpers
- Regenerate patches
- Update WWW |
0.4_13 20 Feb 2017 02:57:04
  |
jbeich  |
devel/libevent2: drop historical suffix after r362796
PR: 216777
Approved by: mm (maintainer) |
0.4_12 04 Feb 2017 07:56:59
  |
jbeich  |
devel/libevent2: update to 2.1.8 and cleanup
- DEFAULT_VERSIONS += ssl=openssl-devel is now supported
- devel/py-event and devel/p5-Event-Lib are marked BROKEN
Changes: https://github.com/libevent/libevent/raw/release-2.1.8-stable/whatsnew-2.1.txt
Changes: https://github.com/libevent/libevent/raw/release-2.1.8-stable/ChangeLog
PR: 216527
Exp-run by: antoine
Approved by: mm (maintainer) |
0.4_11 08 Aug 2016 13:46:50
  |
mat  |
USE_BDB cleanup.
- USE_BDB=4x+ -> USES=bdb.
- USE_BDB=yes -> USES=bdb.
- USE_BDB=xx -> USES=bdb:xx.
Other modernisations when I see them.
PR: 209183
Sponsored by: Absolight |
0.4_11 01 Apr 2016 14:33:58
  |
mat  |
Remove ${PORTSDIR}/ from dependencies, categories v, w, x, y, and z.
With hat: portmgr
Sponsored by: Absolight |
0.4_11 22 May 2015 20:34:29
  |
mat  |
Remove $FreeBSD$ from patches files everywhere.
With hat: portmgr
Sponsored by: Absolight |
0.4_11 06 Mar 2015 12:08:18
  |
amdmi3  |
- Add LICENSE
- Pet portlint
- Drop @dirrm* from plist |
0.4_11 21 Aug 2014 22:50:30
  |
mandree  |
Berkeley DB cleanup, remove versions 4.0 ... 4.7.
- Mk/bsd.database.mk rewrite, new default to db5.
- db6 is eligible by default only if installed on the system.
- Bump PORTREVISION of all ports that directly depend on BerkeleyDB or
where USE_BDB is found in the port's directory
- Patch a few ports such that they will pick up or work with newer
versions.
- Add UPDATING entry
- Drive-by format fix for pks
- Drop BerkeleyDB option from mail/popular for now, requires more work.
- Exp-run logs linked from the PR below.
- Ports that do not build (IGNORE, BROKEN, etc.) have pro-forma changes
for new Berkeley DB, but are untested.
NOTE: please read UPDATING and the Wiki page before proceeding!
Announcement: http://lists.freebsd.org/pipermail/freebsd-ports-announce/2014-August/000090.html
Wiki reference: https://wiki.freebsd.org/Ports/BerkeleyDBCleanup
PR: 192690
Approved by: portmgr (implicit, PORTREVISION bump on unstaged ports) |
0.4_10 29 Jul 2014 17:12:50
  |
adamw  |
Rename all patches that contain '::' as a path separator, and use
'__' instead. |
0.4_10 24 Jul 2014 13:32:59
  |
bapt  |
Only use libevent2
Remove libevent as libevent2 is providing a good compatibility interface as well
as providing better performances.
Remove custom patches from libevent2 and install libevent2 the regular way
Mark ports abusing private fields of the libevent1 API as broken
Import a patch from fedora to have honeyd working with libevent2
Remove most of the patches necessary to find the custom installation we used to
have for libevent2
With hat: portmgr |
0.4_9 05 Mar 2014 13:21:16
  |
bapt  |
Enforce using libevent2 via the compat API (libevent2 performs way better,
libevent1 should die in short term) |
0.4_8 20 Sep 2013 23:36:54
  |
bapt  |
Add NO_STAGE all over the place in preparation for the staging support (cat:
www) |
0.4_8 19 Mar 2011 12:38:54
 |
miwi  |
- Get rid of MD5 support |
0.4_8 25 Jul 2010 15:39:20
 |
mm  |
Update libevent to 1.4.14b
PR: ports/147723
Approved by: maintainer (timeout) |
0.4_7 21 Dec 2009 02:19:12
 |
dougb  |
For ports maintained by ports@FreeBSD.org, remove names and/or
e-mail addresses from the pkg-descr file that could reasonably
be mistaken for maintainer contact information in order to avoid
confusion on the part of users looking for support. As a pleasant
side effect this also avoids confusion and/or frustration for people
who are no longer maintaining those ports. |
0.4_7 19 Aug 2008 16:40:17
 |
mnag  |
- Update libevent dependency and bump PORTREVISION |
0.4_6 02 Jan 2008 23:43:03
 |
mnag  |
- Bump PORTREVISION since devel/libevent are updated. |
0.4_5 21 Sep 2007 20:21:30
 |
mnag  |
- Change libevent lib and bump PORTREVISION since devel/libevent are updated. |
0.4_4 06 Apr 2007 18:28:46
 |
mnag  |
- Bump PORTREVISION and change LIB_DEPENDS to reflect libevent update. |
0.4_3 18 Dec 2006 15:15:01
 |
leeym  |
- utilize USE_BDB |
0.4_3 05 Dec 2006 13:29:01
 |
mnag  |
- Bump PORT_REVISION and change LIB_DEPENDS to reflect update in devel/libevent
- Fix many wrong BUILD_DEPENDS. Thanks to ldd(1) |
0.4_2 13 May 2006 04:41:22
 |
edwin  |
Remove USE_REINPLACE from categories starting with W |
0.4_2 14 Apr 2006 20:45:44
 |
linimon  |
Reset petef due to no response to email. We hope to see him back sometime.
Hat: portmgr |
0.4_2 24 Jan 2006 03:14:23
 |
edwin  |
SHA256ify
Approved by: krion@ |
0.4_2 19 Jul 2005 02:18:32
 |
kris  |
Now builds on 6.x |
0.4_2 01 Jul 2005 22:47:21
 |
jylefort  |
Chase the libevent update.
Reported by: pointyhat |
0.4_1 02 Jun 2005 20:14:55
 |
ume  |
Our getaddrinfo(3) never returns EAI_NODATA on 5.2-RELEASE and
later as RFC 3493 deprecated it. So, we have to see EAI_NONAME
instead.
Approved by: petef |
0.4 10 Apr 2005 21:35:28
 |
kris  |
BROKEN on FreeBSD 6.0: Does not compile |
0.4 31 Mar 2004 03:12:58
 |
trevor  |
SIZEify (maintainer timeout) |
0.4 12 Mar 2004 13:43:32
 |
petef  |
Unbreak on -current. |
0.4 06 Feb 2004 08:13:11
 |
kris  |
BROKEN on 5.x: does not compile |
0.4 21 Dec 2003 18:39:12
 |
petef  |
Update to 0.4. |
0.3_2 05 May 2003 03:16:01
 |
petef  |
- fix build on -current
- no longer need to depend on libmd5 from libwww, libmd does fine
- install an example crawl.conf
PR: 51696
Submitted by: Rui Lopes <rui@ruilopes.com> |
0.3_1 07 Mar 2003 06:12:57
 |
ade  |
Clear moonlight beckons.
Requiem mors pacem pkg-comment,
And be calm ports tree.
E Nomini Patri, E Fili, E Spiritu Sancti. |
0.3_1 21 Nov 2002 23:56:43
 |
edwin  |
PERL -> REINPLACE_CMD
Noticed on: bento |
0.3_1 22 Aug 2002 19:13:46
 |
ade  |
BerkeleyDB cleanup - stage 2
Update databases/db3 to 3.3.11, and fix the few ports that need sorting
after the shlib version update, and a slight API change from 3.2.x->3.3.x |
04 Feb 2002 00:39:19
    |
petef  |
Update to 0.3 |
23 Dec 2001 10:14:50
    |
petef  |
- update to 0.2
- remove redundant GNU_CONFIGURE (this is implied by USE_AUTOCONF) |
23 Aug 2001 00:07:25
    |
petef  |
Change my email address to petef@FreeBSD.org for the ports I maintain. |
04 Jul 2001 07:23:38
    |
dwcjr  |
Update to 1.0b |
23 Jun 2001 17:09:54
    |
dwcjr  |
Add crawl, a worm that searches for jpegs |