Port details |
- py-textract Extract text from any document
- 1.6.5_9 textproc
- 1.6.5_8: Version of this port present on the latest quarterly branch.
- Maintainer: DtxdF@disroot.org
- Port Added: 2022-10-25 20:51:06
- Last Update: 2025-03-08 04:05:21
- Commit Hash: 06a08e6
- People watching this port also watch: jdictionary, py311-Automat, py311-python-gdsii, py311-PyOpenGL, p5-Sane
- Also Listed In: python
- License: MIT
- WWW:
- https://github.com/deanmalmgren/textract
- Description:
- textract provides a single interface for extracting content embedded
from Word documents, PowerPoint presentations, PDFs and much more,
which can be used for further textual analysis and visualization.
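
The single interface described above can be sketched in Python as follows. This is a minimal illustration, assuming the package is installed and relying on upstream's `textract.process()` entry point, which returns bytes. The wrapper `extract_text` and its `process` and `encoding` parameters are hypothetical names introduced here, not part of the port or of textract itself.

```python
from typing import Callable, Optional


def extract_text(path: str,
                 process: Optional[Callable[[str], bytes]] = None,
                 encoding: str = "utf-8") -> str:
    """Return the text content of the document at `path` as a str.

    `process` is a hypothetical injection point (handy for testing);
    when omitted, textract.process from the installed package is used.
    """
    if process is None:
        # Imported lazily so this module loads even without textract.
        import textract
        process = textract.process
    raw = process(path)  # textract.process returns a byte string
    return raw.decode(encoding, errors="replace")
```

With the port installed, `extract_text("report.docx")` would hand the file to whichever backend textract selects for that extension; the file name is only a placeholder.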
- Manual pages:
- FreshPorts has no man page information for this port.
- pkg-plist: as obtained via: make generate-plist
- There is no configure plist information for this port.
- Dependency lines:
- ${PYTHON_PKGNAMEPREFIX}textract>0:textproc/py-textract@${PY_FLAVOR}
- To install the port:
- cd /usr/ports/textproc/py-textract/ && make install clean
- To add the package, run one of these commands:
- pkg install textproc/py-textract
- pkg install py311-textract
NOTE: If this package has multiple flavors (see below), then use one of them instead of the name specified above.
NOTE: This is a Python port. Instead of py311-textract listed in the above command, you can pick from the names under the Packages section.
- PKGNAME: py311-textract
- Package flavors (<flavor>: <package>)
- distinfo:
- TIMESTAMP = 1659835075
SHA256 (textract-1.6.5.tar.gz) = 68f0f09056885821e6c43d8538987518daa94057c306679f2857cc5ee66ad850
SIZE (textract-1.6.5.tar.gz) = 17871
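
The distinfo entries above record the size and SHA256 digest that a downloaded distfile is expected to match. The following is a rough sketch of that comparison, not the ports framework's own implementation (the framework performs its check through `make checksum`). The helper names `parse_distinfo` and `matches` are hypothetical.

```python
import hashlib
import re


def parse_distinfo(text: str) -> dict:
    """Map each distfile name to its recorded SHA256 digest and SIZE."""
    entries: dict = {}
    pattern = re.compile(r"^(SHA256|SIZE) \((.+?)\) = (\S+)$")
    for line in text.splitlines():
        m = pattern.match(line.strip())
        if m:
            kind, name, value = m.groups()
            # SIZE is a byte count; SHA256 stays a hex string.
            entries.setdefault(name, {})[kind] = (
                int(value) if kind == "SIZE" else value
            )
    return entries


def matches(data: bytes, expected: dict) -> bool:
    """Check raw file contents against one parsed distinfo entry."""
    return (len(data) == expected["SIZE"]
            and hashlib.sha256(data).hexdigest() == expected["SHA256"])


DISTINFO = """\
TIMESTAMP = 1659835075
SHA256 (textract-1.6.5.tar.gz) = 68f0f09056885821e6c43d8538987518daa94057c306679f2857cc5ee66ad850
SIZE (textract-1.6.5.tar.gz) = 17871
"""

info = parse_distinfo(DISTINFO)
```

To check a real download, read the tarball's bytes (for example with `pathlib.Path.read_bytes()`) and pass them to `matches` together with `info["textract-1.6.5.tar.gz"]`.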
Packages (timestamps in pop-ups are UTC):
- Dependencies
- NOTE: FreshPorts displays only information on required and default dependencies. Optional dependencies are not covered.
- Build dependencies:
- py311-setuptools>=63.1.0 : devel/py-setuptools@py311
- python3.11 : lang/python311
- Test dependencies:
- python3.11 : lang/python311
- Runtime dependencies:
- py311-argcomplete>=1.10.0 : devel/py-argcomplete@py311
- py311-chardet>=3 : textproc/py-chardet@py311
- py311-six>1.12.0 : devel/py-six@py311
- antiword>0 : textproc/antiword
- py311-beautifulsoup>=4.8.0 : www/py-beautifulsoup@py311
- py311-docx2txt>=0.8 : textproc/py-docx2txt@py311
- ffmpeg>0 : multimedia/ffmpeg
- flac>0 : audio/flac
- jpeg-turbo>0 : graphics/jpeg-turbo
- lame>0 : audio/lame
- py311-libxml2>0 : textproc/py-libxml2@py311
- libxslt>=1.1.15 : textproc/libxslt
- py311-extract-msg>=0.29 : textproc/py-extract-msg@py311
- poppler-utils>0 : graphics/poppler-utils
- py311-python-pptx>=0.6.18 : textproc/py-python-pptx@py311
- pstotext>0 : print/pstotext
- sox>0 : audio/sox
- py311-speechrecognition>=3.8.1 : audio/py-speechrecognition@py311
- py311-xlrd>=1.2.0 : textproc/py-xlrd@py311
- tesseract>0 : graphics/tesseract
- unrtf>0 : textproc/unrtf
- python3.11 : lang/python311
- There are no ports dependent upon this port
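
Each dependency entry above follows the pattern `package[operator version] : origin[@flavor]`. The sketch below splits one entry into those parts. It assumes the listing format shown on this page, and the `Dependency` type and `parse_dependency` function are hypothetical names used only for illustration. It does not compare versions, which the package tools handle with their own rules.

```python
import re
from typing import NamedTuple, Optional


class Dependency(NamedTuple):
    package: str
    operator: Optional[str]
    version: Optional[str]
    origin: str
    flavor: Optional[str]


_DEP_RE = re.compile(
    r"^(?P<package>[^<>=\s]+)"                           # e.g. py311-six
    r"(?:(?P<operator>>=|<=|>|<|=)(?P<version>[^\s:]+))?"  # e.g. >1.12.0
    r"\s*:\s*"
    r"(?P<origin>[^@\s]+)"                               # e.g. devel/py-six
    r"(?:@(?P<flavor>\S+))?$"                            # e.g. py311
)


def parse_dependency(line: str) -> Dependency:
    """Parse one listing line, tolerating a leading list dash."""
    cleaned = line.strip().lstrip("-").strip()
    m = _DEP_RE.match(cleaned)
    if m is None:
        raise ValueError(f"unrecognized dependency line: {line!r}")
    return Dependency(**m.groupdict())
```

For instance, `parse_dependency("- antiword>0 : textproc/antiword")` yields an entry whose `origin` is `textproc/antiword` and whose `flavor` is `None`, since that port is not flavored.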
Configuration Options:
- ===> The following configuration options are available for py311-textract-1.6.5_9:
ANTIWORD=on: DOC document support
BEAUTIFULSOUP=on: HTML parsing library
DOCX2TXT=on: DOCX document support
LIBXML2=on: Python interface for XML parser library
LIBXSLT=on: XML stylesheet transformation library
MSG=on: MS Outlook MSG file format support
PPTX=on: MS PowerPoint PPTX presentations support
PS=on: PostScript document support
SPREADSHEET=on: XLS and XLSX spreadsheet support
UNRTF=on: RTF document support
====> Options available for the group AUDIO
FFMPEG=on: FFmpeg support (WMA, AIFF, AC3, APE...)
FLAC=on: FLAC lossless audio codec support
LAME=on: LAME MP3 audio encoder support
POCKETSPHINX=off: Interface to CMU Sphinxbase and Pocketsphinx
SOX=on: Command-line audio processing tool
SPEECH_RECOGNITION=on: Python library for performing speech recognition
====> Options available for the group OCR
JPEG_TURBO=on: SIMD-accelerated JPEG codec
TESSERACT=on: Commercial quality open source OCR engine
====> PDF document support
PDFMINER=off: PDF parser and analyzer
PDFTOTEXT=on: Extract text from a PDF document
===> Use 'make config' to modify these settings
- Options name:
- textproc_py-textract
- USES:
- python
- FreshPorts was unable to extract/find any pkg message
- Master Sites:
|
Commit History - (may be incomplete: for full details, see links to repositories near top of page) |
Commit | Credits | Log message |
1.6.5_9 08 Mar 2025 04:05:21
    |
Charlie Li (vishwin)  |
python: bump all USE_PYTHON=distutils consumers after RUN_DEPENDS removal
Any missed ports, feel free to bump.
Any ports that need setuptools at runtime can have the devel/py-setuptools
manually added back to RUN_DEPENDS, but understand that this practice
is deprecated; see CHANGES for details. |
1.6.5_8 01 Mar 2024 23:56:15
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump consumers of graphics/poppler
Bump after rupdate in 478df79a3071b399f648107456cf371587e84a3f |
1.6.5_7 03 Jan 2024 07:18:40
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump revision of consumers |
1.6.5_6 27 Jun 2023 19:34:34
    |
Rene Ladan (rene)  |
all: remove explicit versions in USES=python for "3.x+"
The logic in USES=python will automatically convert this to 3.8+ by
itself.
Adjust two ports that only had Python 3.7 mentioned but build fine
on Python 3.8 too.
finance/quickfix: mark BROKEN with PYTHON
libtool: compile: c++ -DHAVE_CONFIG_H -I. -I../.. -I -I. -I.. -I../.. -I../C++
-DLIBICONV_PLUG -DPYTHON_MAJOR_VERSION=3 -Wno-unused-variable
-Wno-maybe-uninitialized -O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong
-fno-strict-aliasing -DLIBICONV_PLUG -Wall -ansi
-Wno-unused-command-line-argument -Wpointer-arith -Wwrite-strings
-Wno-overloaded-virtual -Wno-deprecated-declarations -Wno-deprecated -std=c++0x
-MT _quickfix_la-QuickfixPython.lo -MD -MP -MF
.deps/_quickfix_la-QuickfixPython.Tpo -c QuickfixPython.cpp -fPIC -DPIC -o
.libs/_quickfix_la-QuickfixPython.o
warning: unknown warning option '-Wno-maybe-uninitialized'; did you mean
'-Wno-uninitialized'? [-Wunknown-warning-option]
QuickfixPython.cpp:175:11: fatal error: 'Python.h' file not found
^~~~~~~~~~
1 warning and 1 error generated.
Reviewed by: portmgr, vishwin, yuri
Differential Revision: <https://reviews.freebsd.org/D40568> |
1.6.5_6 14 May 2023 05:58:14
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump dependencies |
1.6.5_5 25 Apr 2023 15:17:15
    |
Christian Weisgerber (naddy)  |
audio/opus: bump consumers after update to 1.4 |
1.6.5_4 20 Apr 2023 04:07:00
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump consumers after update to 23.04
graphics/poppler was updated in 06339c451266f5843e53bd6406c81a89eedd4ab1 |
1.6.5_3 30 Jan 2023 13:02:41
    |
Po-Chuan Hsieh (sunpoet)  |
textproc/py-textract: Add NO_ARCH
- While I'm here, fix indent
Approved by: portmgr (blanket) |
1.6.5_3 30 Jan 2023 12:59:34
    |
Po-Chuan Hsieh (sunpoet)  |
audio/py-speechrecognition: Update to 3.9.0
- Update PORTNAME: use lowercase
- Change MASTER_SITES from GitHub to PYPI
- Update version requirement of RUN_DEPENDS
- Take maintainership
Changes: https://github.com/Uberi/speech_recognition/releases |
1.6.5_3 11 Jan 2023 15:58:34
    |
Dmitry Marakasov (amdmi3)  |
*/*: rename CHEESESHOP to PYPI in MASTER_SITES
PR: 267994
Differential revision: D37518
Approved by: bapt |
1.6.5_3 09 Jan 2023 12:37:17
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump dependencies
Follow-up to 9b78681895a5a5b7225299242098f7f2f27d959c |
1.6.5_2 08 Dec 2022 05:45:34
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump dependencies |
1.6.5_1 08 Nov 2022 05:07:17
    |
Tobias C. Berner (tcberner)  |
graphics/poppler: bump PORTREVISION of dependencies
- after update to 22.11 in d01d0d73b169 |
1.6.5 25 Oct 2022 20:49:12
    |
Li-Wen Hsu (lwhsu)  Author: Jesús Daniel Colmenares Oviedo |
Add textproc/py-textract: Extract text from any document
textract provides a single interface for extracting content embedded
from Word documents, PowerPoint presentations, PDFs and much more,
which can be used for further textual analysis and visualization.
WWW: https://github.com/deanmalmgren/textract
PR: 265768 |