CVE-2026-28350

Name: CVE-2026-28350
Description: lxml_html_clean is a project for HTML cleaning functionalities copied from `lxml.html.clean`. Prior to version 0.4.4, the <base> tag passes through the default Cleaner configuration. While page_structure=True removes html, head, and title tags, there is no specific handling for <base>, allowing an attacker to inject it and hijack relative links on the page. This issue has been patched in version 0.4.4.
Source: CVE (at NVD; CERT, ENISA, LWN, oss-sec, fulldisc, Debian ELTS, Red Hat, Ubuntu, Gentoo, SUSE bugzilla/CVE, GitHub advisories/code/issues, web search, more)
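
To illustrate why an injected <base> tag matters, the following minimal sketch resolves a relative link with and without a base URL, using only the Python standard library. The helper name `resolve` and the example hosts are hypothetical; the sketch approximates browser URL resolution and does not call lxml_html_clean itself.

```python
from urllib.parse import urljoin


def resolve(page_url, href, base_href=None):
    """Resolve href against <base href> if one is present, else the page URL."""
    # A <base href> value is itself resolved against the page URL first.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


page = "https://site.example/account/"

# Without <base>, the relative link stays on the original site.
print(resolve(page, "login"))

# With an injected <base href="https://attacker.example/">, the same
# relative link now points at the attacker-controlled host.
print(resolve(page, "login", "https://attacker.example/"))
```

Every relative link, form action, and resource reference on the page is affected in the same way, which is why the description speaks of hijacking relative links.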

Vulnerable and fixed packages

The table below lists information on source packages.

Source Package        | Release                       | Version                | Status
lxml (PTS)            | bullseye (security), bullseye | 4.6.3+dfsg-0.1+deb11u1 | vulnerable
                      | bookworm                      | 4.9.2-1                | vulnerable
                      | trixie                        | 5.4.0-1                | fixed
                      | forky, sid                    | 6.1.0-1                | fixed
lxml-html-clean (PTS) | trixie                        | 0.4.2-1                | vulnerable
                      | forky                         | 0.4.4-1                | fixed
                      | sid                           | 0.4.5-1                | fixed

The information below is based on the following data on fixed versions.

Package         | Type   | Release    | Fixed Version | Urgency | Origin | Debian Bugs
lxml            | source | (unstable) | 5.2.0-1       |         |        |
lxml-html-clean | source | (unstable) | 0.4.4-1       |         |        |
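
The fixed versions above can be turned into a rough status check. The sketch below compares only the leading dotted-numeric part of a version string against the fixed versions listed in this entry; the names `FIXED`, `upstream_tuple`, `is_fixed`, and `installed_version` are hypothetical. It ignores epochs and Debian revisions, so it is an approximation rather than a full Debian version comparison. Note that for lxml, "fixed" at 5.2.0 reflects the cleaner having been split out of the package.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Fixed upstream versions as listed in this tracker entry.
FIXED = {"lxml": "5.2.0", "lxml-html-clean": "0.4.4"}


def upstream_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading dotted-numeric part, e.g. '0.4.2-1' -> (0, 4, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version prefix in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_fixed(package: str, version: str) -> bool:
    """True if version is at or above the fixed version for package."""
    return upstream_tuple(version) >= upstream_tuple(FIXED[package])


def installed_version(dist_name: str) -> Optional[str]:
    """Version of an installed Python distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Example with versions taken from the table of source packages.
for pkg, ver in [("lxml", "4.9.2-1"), ("lxml-html-clean", "0.4.4-1")]:
    print(pkg, ver, "fixed" if is_fixed(pkg, ver) else "vulnerable")
```

For Debian package versions, `dpkg --compare-versions` remains the authoritative comparison; the tuple comparison here is only a convenience for the simple version strings in this entry.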

Notes

[trixie] - lxml-html-clean <no-dsa> (Minor issue)
https://github.com/fedora-python/lxml_html_clean/security/advisories/GHSA-xvp8-3mhv-424c
Fixed by: https://github.com/fedora-python/lxml_html_clean/commit/9c5612ca33b941eec4178abf8a5294b103403f34 (0.4.4)
lxml-html-clean was split out of lxml in 5.2.0
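
Since the trixie package is marked <no-dsa>, callers on an unpatched version may want to check cleaned output for leftover <base> tags themselves. The following is a minimal sketch of such a post-check using the standard library's `html.parser`; it is not the upstream fix, and the names `BaseTagFinder` and `find_base_hrefs` are hypothetical. `html.parser` does not tokenize exactly as browsers do, so this should be treated as a heuristic.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collect the href of every <base> start tag seen in the input."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <BASE HREF=...>
        # is matched too. Self-closing <base/> also reaches this method via
        # the default handle_startendtag implementation.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html):
    """Return the href values (or None) of all <base> tags in html."""
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


cleaned = '<p>hello</p><base href="https://attacker.example/"><a href="login">log in</a>'
print(find_base_hrefs(cleaned))
```

A non-empty result indicates that a <base> tag survived cleaning and the output should be rejected or processed further.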
