18.3 The String Type
The String type is a growable, heap-allocated UTF-8 buffer specialized for text. It’s similar to Vec<u8> but guarantees valid UTF-8 content.
18.3.1 String vs. &str
- String: An owned, mutable text buffer. It frees its memory when it goes out of scope and can grow as needed.
- &str: A borrowed slice of UTF-8 data, such as a literal ("Hello") or a substring of an existing String.
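The ownership difference shows up most clearly in function signatures. The following is a minimal sketch; the helper names shout and first_word are hypothetical and exist only for illustration. A parameter of type &str accepts both a borrowed String and a literal, while a String return value hands ownership to the caller.

```rust
// Hypothetical helper: borrows its input, so it accepts `&String` and `&str` alike.
fn shout(text: &str) -> String {
    // `to_uppercase` allocates and returns a new, owned String.
    text.to_uppercase()
}

// Hypothetical helper: returns a borrowed view into the input, with no allocation.
fn first_word(text: &str) -> &str {
    // `split_whitespace` yields `&str` slices that borrow from `text`.
    text.split_whitespace().next().unwrap_or("")
}
```

Calling shout(&owned) with a String named owned relies on deref coercion from &String to &str, and owned remains usable afterwards because it was only borrowed.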
18.3.2 String vs. Vec<u8>
Both store bytes on the heap, but String ensures the bytes are always valid UTF-8. This makes indexing by integer offset non-trivial, since a single Unicode character can span up to four bytes. When handling arbitrary binary data, use a Vec<u8> instead.
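Converting between the two types makes the UTF-8 guarantee concrete. This is a small sketch using the standard library's String::from_utf8 and String::into_bytes; the wrapper names bytes_to_text and text_to_bytes are hypothetical.

```rust
// Hypothetical helper: try to interpret raw bytes as text.
fn bytes_to_text(bytes: Vec<u8>) -> Option<String> {
    // `String::from_utf8` validates the bytes and returns Err on invalid UTF-8;
    // `.ok()` turns that Result into an Option for brevity.
    String::from_utf8(bytes).ok()
}

// Hypothetical helper: give up the UTF-8 guarantee and get the raw bytes back.
fn text_to_bytes(text: String) -> Vec<u8> {
    // `into_bytes` consumes the String and reuses its buffer without copying.
    text.into_bytes()
}
```

Going from String to Vec<u8> always succeeds, while the reverse direction has to be checked, which is why it returns a fallible result.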
18.3.3 Creating and Combining Strings
// From a string literal or `.to_string()`
let s1 = String::from("Hello");
let s2 = "Hello".to_string();
// From other data
let number = 42;
let s3 = number.to_string(); // Produces "42"
// Empty string
let mut s4 = String::new();
s4.push_str("Hello");
Concatenation:
let s1 = String::from("Hello");
let s2 = String::from("World");
// The + operator consumes s1
let s3 = s1 + " " + &s2;
// After this, s1 is unusable
// format! macro is often more flexible
let name = "Alice";
let greeting = format!("Hello, {}!", name); // No moves occur
18.3.4 Handling UTF-8
Indexing a String with an integer (s[0]) is disallowed and does not compile, because a byte offset may fall in the middle of a multi-byte character. Instead, iterate over characters if needed:
for ch in "Hello".chars() {
    println!("{}", ch);
}
For advanced Unicode handling (e.g., grapheme clusters), you may need external crates like unicode-segmentation.
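The gap between bytes and characters can be illustrated with standard-library methods only. The sketch below is one possible approach, with hypothetical helper names; it works on Unicode scalar values (char), not grapheme clusters.

```rust
// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    // `len` counts bytes; `chars().count()` walks the string and counts scalar values.
    (text.len(), text.chars().count())
}

// Hypothetical helper: the first `n` chars, never cutting a character in half.
fn first_chars(text: &str, n: usize) -> &str {
    // `char_indices` pairs each char with its starting byte offset,
    // so the offset of the n-th char is always a valid slice boundary.
    match text.char_indices().nth(n) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text, // fewer than `n` chars: return everything
    }
}
```

For a string such as "héllo", the byte count is larger than the character count because é occupies two bytes in UTF-8.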
18.3.5 Common String Methods
- push (adds a single char) and push_str (adds a &str):
  let mut s = String::from("Hello");
  s.push(' ');
  s.push_str("Rust!");
- replace:
  let sentence = "I like apples.".to_string();
  let replaced = sentence.replace("apples", "bananas");
- split and join:
  let fruits = "apple,banana,orange".to_string();
  let parts: Vec<&str> = fruits.split(',').collect();
  let joined = parts.join(" & ");
- Converting to bytes:
  let bytes = "Rust".as_bytes();
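These methods are often chained together. As a rough sketch, the hypothetical helper normalize_list below combines split and join with trim and is_empty (two further standard str methods) to tidy a comma-separated list.

```rust
// Hypothetical helper: tidy a comma-separated list such as " apple ,banana,, orange ".
fn normalize_list(input: &str) -> String {
    let parts: Vec<&str> = input
        .split(',')                        // borrowed pieces between commas
        .map(|item| item.trim())           // drop surrounding whitespace
        .filter(|item| !item.is_empty())   // skip empty entries
        .collect();
    // `join` is called on the collected slices and builds a new String.
    parts.join(", ")
}
```

Only the final join allocates a new String; the intermediate pieces are &str slices borrowing from the input.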
18.3.6 Summary: String vs. C
C strings are typically null-terminated char * buffers. Manually resizing or copying them can be error-prone. Rust’s String automatically tracks its length and capacity and enforces UTF-8 correctness. Out-of-bounds access is caught, by a panic or a None result, instead of reading past the end of the buffer. The buffer grows automatically when more space is required, and its allocation is freed when the String value goes out of scope.
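The capacity tracking and bounds checking described above can be observed directly. This is a minimal sketch with hypothetical helper names, using String::with_capacity and str::get from the standard library.

```rust
// Hypothetical helper: build a string of `n` copies of `ch`.
fn repeat_char(ch: char, n: usize) -> String {
    // Reserve space up front; the String would still grow on its own if this were too small.
    let mut s = String::with_capacity(n * ch.len_utf8());
    for _ in 0..n {
        s.push(ch);
    }
    s // the buffer is freed when the caller's value goes out of scope
}

// Hypothetical helper: a checked slice that never reads past the buffer.
fn safe_slice(text: &str, start: usize, end: usize) -> Option<&str> {
    // `get` returns None when the range is out of bounds
    // or does not fall on character boundaries.
    text.get(start..end)
}
```

Where C code would need manual realloc calls and explicit length checks, here the growth happens inside push and the range check happens inside get.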