Port details on branch 2024Q4
- libtextcat Language guessing by N-Gram-Based Text Categorization
- 2.2_6 textproc (version of this port present on the latest quarterly branch: 2.2_6)
- Maintainer: thierry@FreeBSD.org
- Port Added: 2006-12-04 22:04:18
- Last Update: 2022-09-07 21:58:51
- Commit Hash: fb16dfe
- People watching this port also watch: python, vim, libtool, bsh, libXdamage
- License: BSD3CLAUSE
- WWW:
- https://software.wise-guys.nl/libtextcat/
- Description:
- Libtextcat is a library with functions that implement the classification
technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization" [1].
It was primarily developed for language guessing, a task on which it is known to
perform with near-perfect accuracy.

The central idea of the Cavnar & Trenkle technique is to calculate a
"fingerprint" of a document with an unknown category, and compare this with the
fingerprints of a number of documents of which the categories are known. The
categories of the closest matches are output as the classification. A
fingerprint is a list of the most frequent n-grams occurring in a document,
ordered by frequency. Fingerprints are compared with a simple out-of-place
metric.

[1] The document that started it all: William B. Cavnar & John M. Trenkle (1994)
N-Gram-Based Text Categorization, <http://citeseer.ist.psu.edu/68861.html>.
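The fingerprint and out-of-place comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not libtextcat's C API or its actual implementation: the function names (`fingerprint`, `out_of_place`, `classify`), the `_` word padding, the n-gram length limit of 5, and the fingerprint size of 300 are all assumptions chosen for the example.

```python
from collections import Counter

TOP = 300  # assumed fingerprint size; also used as the penalty for a missing n-gram


def fingerprint(text, max_n=5, top=TOP):
    """Return the `top` most frequent n-grams (lengths 1..max_n), most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries (an assumption of this sketch)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending frequency; break ties alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_fp, cat_fp, max_penalty=TOP):
    """Sum, over the document's n-grams, of how far each one is from its rank in cat_fp."""
    positions = {gram: rank for rank, gram in enumerate(cat_fp)}
    total = 0
    for rank, gram in enumerate(doc_fp):
        if gram in positions:
            total += abs(rank - positions[gram])
        else:
            total += max_penalty  # n-gram unknown to this category
    return total


def classify(text, category_fps):
    """Return the category name whose fingerprint is closest to the text's fingerprint."""
    doc_fp = fingerprint(text)
    return min(category_fps, key=lambda name: out_of_place(doc_fp, category_fps[name]))


# Hypothetical, tiny training samples; real fingerprints are built from much larger texts.
samples = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "dutch": "de snelle bruine vos springt over de luie hond en de kat zat op de mat",
}
category_fps = {name: fingerprint(text) for name, text in samples.items()}
guess = classify("the dog sat on the mat", category_fps)
```

A lower out-of-place distance means a closer match, so `classify` simply takes the minimum. With samples this small the guess is not reliable; the sketch is meant to show the mechanics only.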
- Manual pages:
- FreshPorts has no man page information for this port.
- pkg-plist: as obtained via: make generate-plist
- Dependency lines:
- libtextcat>0:textproc/libtextcat
- To install the port:
- cd /usr/ports/textproc/libtextcat/ && make install clean
- To add the package, run one of these commands:
- pkg install textproc/libtextcat
- pkg install libtextcat
NOTE: If this package has multiple flavors (see below), then use one of them instead of the name specified above.
- PKGNAME: libtextcat
- Flavors: there is no flavor information for this port.
- distinfo:
- SHA256 (libtextcat-2.2.tar.gz) = 5677badffc48a8d332e345ea4fe225e3577f53fc95deeec8306000b256829655
SIZE (libtextcat-2.2.tar.gz) = 540999
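As an illustration of how the two distinfo values can be used, the following Python sketch checks a downloaded distfile against an expected SHA256 digest and size. It is a minimal sketch, not how the ports framework itself performs the check; the function name `verify_distfile` and the local file path in the usage comment are hypothetical.

```python
import hashlib
import os


def verify_distfile(path, expected_sha256, expected_size):
    """Return True if the file at `path` matches both the expected size and SHA256 digest."""
    # Cheap check first: a size mismatch means the file cannot be the right one.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()


# Usage, assuming the tarball was saved in the current directory (hypothetical path):
# verify_distfile(
#     "libtextcat-2.2.tar.gz",
#     "5677badffc48a8d332e345ea4fe225e3577f53fc95deeec8306000b256829655",
#     540999,
# )
```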
- This port has no dependencies.
- This port is required by:
- For libraries:
- editors/openoffice-4
- editors/openoffice-devel
Configuration Options:
- ===> The following configuration options are available for libtextcat-2.2_6:
DOCS=on: Build and/or install documentation
===> Use 'make config' to modify these settings
- Options name:
- textproc_libtextcat
- USES:
- libtool
- FreshPorts was unable to extract/find any pkg message
- Master Sites: