2 changes: 1 addition & 1 deletion .github/workflows/rust.yml
@@ -14,7 +14,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.79.0
- uses: dtolnay/rust-toolchain@1.86.0
- run: cargo check

minimal-versions:
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -10,7 +10,7 @@ repository = "https://github.com/tafia/quick-xml"
keywords = ["xml", "serde", "parser", "writer", "html"]
categories = ["asynchronous", "encoding", "parsing", "parser-implementations"]
license = "MIT"
rust-version = "1.79"
rust-version = "1.86"
# We exclude tests & examples & benches to reduce the size of a package.
# Unfortunately, this is source of warnings in latest cargo when packaging:
# > warning: ignoring {context} `{name}` as `{path}` is not included in the published package
@@ -123,7 +123,7 @@ async-tokio = ["tokio"]
## let mut buf = Vec::new();
## let mut unsupported = false;
## loop {
## if !reader.decoder().encoding().is_ascii_compatible() {
## if !reader.encoding().is_ascii_compatible() {
## unsupported = true;
## break;
## }
57 changes: 49 additions & 8 deletions Changelog.md
@@ -14,12 +14,57 @@

## Unreleased

### New Features

### Bug Fixes
This is a large release. The primary change is an ergonomic improvement across the entire API:
`quick-xml` now uses `&str` and `String` where possible instead of
`&[u8]` and `Vec<u8>`. Downstream code will need significant refactoring,
but the result should be a net simplification, with potential performance improvements.

The MSRV has been raised to 1.86.

### Breaking Changes

- [#963]: Reader now validates that input is valid UTF-8 when constructing events.
Non-UTF-8 input passed to `Reader::from_reader()` without `DecodingReader` will now
produce `Error::Encoding` instead of silently passing through invalid bytes.
Use `DecodingReader` to transcode non-UTF-8 sources.
- [#963]: Name types (`QName`, `LocalName`, `Prefix`, `Namespace`, `PrefixDeclaration`)
now wrap `&str` instead of `&[u8]`. `into_inner()` returns `&str`, and `AsRef<str>`
is implemented (`AsRef<[u8]>` has been removed). `ResolveResult::Unknown` now contains `String`
instead of `Vec<u8>`, and `NamespaceError` variants contain `String` instead of `Vec<u8>`.
- [#963]: Removed the `decoder: Decoder` field from event types (`BytesStart`, `BytesText`,
`BytesCData`, `BytesRef`) and `Attributes`. The `decoder()` method is no longer available
on these types. Decode methods on events now always assume UTF-8 input.
`Error::missed_end()` no longer takes a `Decoder` parameter.
- [#963]: Event types (`BytesStart`, `BytesEnd`, `BytesText`, `BytesCData`, `BytesPI`,
`BytesRef`) now store `Cow<str>` internally instead of `Cow<[u8]>`. `into_inner()` on
`BytesText`, `BytesCData`, `BytesPI`, and `BytesRef` now returns `Cow<str>`.
`BytesStart::set_name()` now takes `&str` instead of `&[u8]`.
- [#963]: All event types and the `Event` enum now implement `Deref<Target = str>`
instead of `Deref<Target = [u8]>`. Explicit `AsRef<str>` impls are provided to
avoid ambiguity.
- [#963]: Removed `decode()` methods from `BytesText`, `BytesCData`, and `BytesRef`.
Content is already available as `&str` via `Deref`. The `xml10_content()`,
`xml11_content()`, `xml_content()`, and `html_content()` methods now return
`Cow<str>` directly instead of `Result<Cow<str>, EncodingError>`.
- [#963]: `Attribute::value` is now `Cow<'a, str>` instead of `Cow<'a, [u8]>`.
The `From<(&[u8], &[u8])>` impl has been removed.
- [#963]: `BytesDecl::version()`, `encoding()`, and `standalone()` now return
`Cow<'_, str>` instead of `Cow<'_, [u8]>`.
- [#963]: Removed `Reader::decoder()` method. Use `Reader::encoding()` instead
(available with the `encoding` feature). Removed `decoder()` from the `XmlRead`
serde trait. Removed all methods from `Decoder` (the struct is kept only for
backward compatibility with deprecated `Attribute` methods).
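
The effect of the `&[u8]` to `&str` switch on downstream matching code can be sketched without the crate itself. The `QName` below is a hypothetical stand-in newtype written only for illustration, not the crate's actual definition; it assumes nothing beyond what the entries above state (the type wraps `&str`, `into_inner()` returns `&str`, and `AsRef<str>` is implemented).

```rust
// Hypothetical stand-in for a name type that wraps `&str`.
// This is NOT the definition used by quick-xml; it only mirrors the
// shape described in the changelog entries above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QName<'a>(&'a str);

impl<'a> QName<'a> {
    /// Returns the wrapped string slice.
    fn into_inner(self) -> &'a str {
        self.0
    }
}

impl AsRef<str> for QName<'_> {
    fn as_ref(&self) -> &str {
        self.0
    }
}

/// Classifies an element by name using plain string patterns
/// (previously this would have needed byte-string patterns such as `b"w:tbl"`).
fn classify(name: QName<'_>) -> &'static str {
    match name.as_ref() {
        "w:tbl" => "table",
        "w:tr" => "row",
        "w:tc" => "cell",
        _ => "other",
    }
}

fn main() {
    let name = QName("w:tbl");
    // Comparison against a string-backed name needs no `b"..."` literal.
    assert_eq!(name, QName("w:tbl"));
    assert_eq!(classify(name), "table");
    println!("{} -> {}", name.into_inner(), classify(name));
}
```

With a string-backed name, call sites that previously went through `String::from_utf8(...).unwrap()` to obtain text can, under these assumptions, use the name directly.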

### Misc Changes

- [#963]: MSRV bumped to 1.86 (April 2025)
- [#963]: Deprecated `Attribute` methods that take a `Decoder` parameter, since
attribute values are now always valid UTF-8: `decoded_and_normalized_value()`,
`decoded_and_normalized_value_with()`, `decode_and_unescape_value()`, and
`decode_and_unescape_value_with()`. Use `normalized_value()` and
`normalized_value_with()` instead.
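
The reasoning behind the deprecation (values are validated as UTF-8 up front, so accessors no longer need a decoder or a `Result`) can be illustrated with the standard library alone. The `Text` type below is a hypothetical payload invented for this sketch; it is not a quick-xml type, and the only real API it relies on is `std::str::from_utf8`.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Hypothetical payload whose bytes have already been validated as UTF-8.
/// Illustrative only; not a type from quick-xml.
struct Text<'a>(Cow<'a, str>);

impl<'a> Text<'a> {
    /// Validates the raw bytes once, at construction time.
    fn from_bytes(raw: &'a [u8]) -> Result<Self, Utf8Error> {
        std::str::from_utf8(raw).map(|s| Text(Cow::Borrowed(s)))
    }

    /// Infallible accessor: no decoder argument and no `Result` are needed,
    /// because validity was established when the value was built.
    fn content(&self) -> &str {
        &self.0
    }
}

fn main() {
    let ok = Text::from_bytes(b"hello world").expect("valid UTF-8");
    assert_eq!(ok.content(), "hello world");

    // The byte 0xFF never appears in well-formed UTF-8, so construction fails.
    assert!(Text::from_bytes(&[0x68, 0x69, 0xFF]).is_err());
}
```

Moving validation to the boundary means the error surfaces once, when the input is read, rather than at every later access.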

[#963]: https://github.com/tafia/quick-xml/pull/963

## 0.41.0 -- 2026-06-29

@@ -48,7 +93,6 @@
[#969]: https://github.com/tafia/quick-xml/issues/969
[#970]: https://github.com/tafia/quick-xml/issues/970


## 0.40.1 -- 2026-05-15

### Bug Fixes
@@ -65,14 +109,11 @@

[#964]: https://github.com/tafia/quick-xml/pull/964

### Misc Changes


## 0.40.0 -- 2026-05-11

MSRV bumped to 1.79.

Now `quick-xml` supports the UTF-16 encoded documents. See the new `DecodingReader` type.
Now `quick-xml` supports UTF-16 encoded documents. See the new `DecodingReader` type.

### New Features

2 changes: 1 addition & 1 deletion README.md
@@ -4,7 +4,7 @@
[![Crate](https://img.shields.io/crates/v/quick-xml.svg)](https://crates.io/crates/quick-xml)
[![docs.rs](https://docs.rs/quick-xml/badge.svg)](https://docs.rs/quick-xml)
[![codecov](https://img.shields.io/codecov/c/github/tafia/quick-xml)](https://codecov.io/gh/tafia/quick-xml)
[![MSRV](https://img.shields.io/badge/rustc-1.79.0+-ab6000.svg)](https://blog.rust-lang.org/2024/06/13/Rust-1.79.0/)
[![MSRV](https://img.shields.io/badge/rustc-1.86.0+-ab6000.svg)](https://blog.rust-lang.org/2025/04/03/Rust-1.86.0/)

High performance xml pull reader/writer.

16 changes: 8 additions & 8 deletions benches/macrobenches.rs
@@ -54,11 +54,11 @@ fn parse_document_from_str(doc: &str) -> XmlResult<()> {
}
Event::Start(e) | Event::Empty(e) => {
for attr in e.attributes() {
black_box(attr?.decoded_and_normalized_value(version, r.decoder())?);
black_box(attr?.normalized_value(version)?);
}
}
Event::Text(e) => {
black_box(e.xml10_content()?);
black_box(e.xml10_content());
}
Event::CData(e) => {
black_box(e.into_inner());
@@ -82,11 +82,11 @@ fn parse_document_from_bytes(doc: &[u8]) -> XmlResult<()> {
}
Event::Start(e) | Event::Empty(e) => {
for attr in e.attributes() {
black_box(attr?.decoded_and_normalized_value(version, r.decoder())?);
black_box(attr?.normalized_value(version)?);
}
}
Event::Text(e) => {
black_box(e.xml10_content()?);
black_box(e.xml10_content());
}
Event::CData(e) => {
black_box(e.into_inner());
@@ -111,11 +111,11 @@ fn parse_document_from_str_with_namespaces(doc: &str) -> XmlResult<()> {
(resolved_ns, Event::Start(e) | Event::Empty(e)) => {
black_box(resolved_ns);
for attr in e.attributes() {
black_box(attr?.decoded_and_normalized_value(version, r.decoder())?);
black_box(attr?.normalized_value(version)?);
}
}
(resolved_ns, Event::Text(e)) => {
black_box(e.xml10_content()?);
black_box(e.xml10_content());
black_box(resolved_ns);
}
(resolved_ns, Event::CData(e)) => {
@@ -142,11 +142,11 @@ fn parse_document_from_bytes_with_namespaces(doc: &[u8]) -> XmlResult<()> {
(resolved_ns, Event::Start(e) | Event::Empty(e)) => {
black_box(resolved_ns);
for attr in e.attributes() {
black_box(attr?.decoded_and_normalized_value(version, r.decoder())?);
black_box(attr?.normalized_value(version)?);
}
}
(resolved_ns, Event::Text(e)) => {
black_box(e.xml10_content()?);
black_box(e.xml10_content());
black_box(resolved_ns);
}
(resolved_ns, Event::CData(e)) => {
4 changes: 2 additions & 2 deletions benches/microbenches.rs
@@ -146,7 +146,7 @@ fn one_event(c: &mut Criterion) {
config.trim_text(true);
config.check_end_names = false;
match r.read_event() {
Ok(Event::Comment(e)) => nbtxt += e.xml10_content().unwrap().len(),
Ok(Event::Comment(e)) => nbtxt += e.xml10_content().len(),
something_else => panic!("Did not expect {:?}", something_else),
};

@@ -225,7 +225,7 @@ fn attributes(c: &mut Criterion) {
let mut count = black_box(0);
loop {
match r.read_event() {
Ok(Event::Empty(e)) if e.name() == QName(b"player") => {
Ok(Event::Empty(e)) if e.name() == QName("player") => {
for name in ["num", "status", "avg"] {
if let Some(_attr) = e.try_get_attribute(name).unwrap() {
count += 1
49 changes: 12 additions & 37 deletions examples/custom_entities.rs
@@ -12,16 +12,14 @@

use std::borrow::Cow;
use std::collections::{HashMap, VecDeque};
use std::str::from_utf8;

use quick_xml::encoding::Decoder;
use quick_xml::errors::Error;
use quick_xml::escape::EscapeError;
use quick_xml::events::{BytesEnd, BytesStart, BytesText, Event};
use quick_xml::name::QName;
use quick_xml::reader::Reader;
use quick_xml::XmlVersion;
use regex::bytes::Regex;
use regex::Regex;

use pretty_assertions::assert_eq;

@@ -31,7 +29,7 @@ struct MyReader<'i> {
readers: VecDeque<Reader<&'i [u8]>>,
/// Map of captured internal _parsed general entities_. _Parsed_ means that
/// value of the entity is parsed by XML reader
entities: HashMap<&'i [u8], &'i [u8]>,
entities: HashMap<&'i str, &'i str>,
/// In this example we use simple regular expression to capture entities from DTD.
/// In real application you should use DTD parser.
entity_re: Regex,
@@ -71,7 +69,7 @@ impl<'i> MyReader<'i> {
self.readers.push_back(reader);
return Ok(Event::Text(BytesText::from_escaped(ch.to_string())));
}
let mut r = Reader::from_reader(self.resolve(&e)?);
let mut r = Reader::from_reader(self.resolve(&e)?.as_bytes());
*r.config_mut() = reader.config().clone();

self.readers.push_back(reader);
@@ -100,33 +98,20 @@ impl<'i> MyReader<'i> {
Cow::Owned(_) => unreachable!("We are sure that event will be borrowed"),
};
for cap in self.entity_re.captures_iter(doctype) {
self.entities.insert(
cap.get(1).unwrap().as_bytes(),
cap.get(2).unwrap().as_bytes(),
);
self.entities
.insert(cap.get(1).unwrap().as_str(), cap.get(2).unwrap().as_str());
}
}

fn resolve(&self, entity: &[u8]) -> Result<&'i [u8], EscapeError> {
fn resolve(&self, entity: &str) -> Result<&'i str, EscapeError> {
match self.entities.get(entity) {
Some(replacement) => Ok(replacement),
None => Err(EscapeError::UnrecognizedEntity(
0..0,
String::from_utf8_lossy(entity).into_owned(),
)),
None => Err(EscapeError::UnrecognizedEntity(0..0, entity.to_owned())),
}
}

fn get_entity(&self, entity: &str) -> Option<&'i str> {
self.entities
.get(entity.as_bytes())
// SAFETY: We are sure that slices are correct UTF-8 because we get
// them from rust string
.map(|value| from_utf8(value).unwrap())
}

fn decoder(&self) -> Decoder {
self.readers.back().unwrap().decoder()
self.entities.get(entity).copied()
}
}

@@ -154,14 +139,9 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut attrs = e.attributes();

let label = attrs.next().unwrap()?;
assert_eq!(label.key, QName(b"label"));
assert_eq!(label.key, QName("label"));
assert_eq!(
label.decoded_and_normalized_value_with(
XmlVersion::Implicit1_0,
reader.decoder(),
9,
|e| reader.get_entity(e)
)?,
label.normalized_value_with(XmlVersion::Implicit1_0, 9, |e| reader.get_entity(e))?,
"Message: hello world"
);

@@ -190,14 +170,9 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut attrs = e.attributes();

let attr = attrs.next().unwrap()?;
assert_eq!(attr.key, QName(b"attr"));
assert_eq!(attr.key, QName("attr"));
assert_eq!(
attr.decoded_and_normalized_value_with(
XmlVersion::Implicit1_0,
reader.decoder(),
9,
|e| { reader.get_entity(e) }
)?,
attr.normalized_value_with(XmlVersion::Implicit1_0, 9, |e| { reader.get_entity(e) })?,
"Message: hello world"
);

13 changes: 5 additions & 8 deletions examples/nested_readers.rs
@@ -22,7 +22,7 @@ fn main() -> Result<(), quick_xml::Error> {
loop {
match reader.read_event_into(&mut buf)? {
Event::Start(element) => {
if let b"w:tbl" = element.name().as_ref() {
if element.name().as_ref() == "w:tbl" {
count += 1;
let mut stats = TableStat {
index: count,
@@ -35,20 +35,17 @@ fn main() -> Result<(), quick_xml::Error> {
skip_buf.clear();
match reader.read_event_into(&mut skip_buf)? {
Event::Start(element) => match element.name().as_ref() {
b"w:tr" => {
"w:tr" => {
stats.rows.push(vec![]);
row_index = stats.rows.len() - 1;
}
b"w:tc" => {
stats.rows[row_index].push(
String::from_utf8(element.name().as_ref().to_vec())
.unwrap(),
);
"w:tc" => {
stats.rows[row_index].push(element.name().as_ref().to_string());
}
_ => {}
},
Event::End(element) => {
if element.name().as_ref() == b"w:tbl" {
if element.name().as_ref() == "w:tbl" {
found_tables.push(stats);
break;
}
1 change: 0 additions & 1 deletion examples/read_buffered.rs
@@ -18,7 +18,6 @@ fn main() -> Result<(), quick_xml::Error> {
match reader.read_event_into(&mut buf) {
Ok(Event::Start(ref e)) => {
let name = e.name();
let name = reader.decoder().decode(name.as_ref())?;
println!("read start event {:?}", name.as_ref());
count += 1;
}