trafilatura
2.0.0
last stable release 1 year ago
Complexity Score: High
Open Issues: N/A
Dependent Projects: 45
Weekly Downloads (global): 460,921
Keywords
corpus
html2text
news-crawler
natural-language-processing
scraper
tei-xml
text-extraction
webscraping
web-scraping
article-extractor
corpus-builder
corpus-tools
crawler
html-to-markdown
llm
news-aggregator
nlp
rag
readability
rss-feed
scraping
tei
text-cleaning
text-mining
text-preprocessing
License: Apache-2.0
attribution: Yes
linking: Permissive
distribution: Permissive
modification: Permissive
patent grant: Yes
private use: Yes
sublicensing: Permissive
trademark grant: No
Quality: 59
CVE Issues (Active): 0
Scorecards Score: 4.70
Test Coverage: No Data
Follows Semver: No
GitHub Stars: 3,867
Dependencies (total): 22
Dependencies (Outdated): 0
Dependencies (Deprecated): 0
Threat Modelling: No Data
Repo Audits: No Data
Maintenance: 19
Docs: 80
Learn how to distribute trafilatura in your own private PyPI registry
$ pip install trafilatura
Releases
2.0.0 (stable version), released 1 year ago