18.3 The String Type

The String type is a growable, heap-allocated UTF-8 buffer specialized for text. It’s similar to Vec<u8> but guarantees valid UTF-8 content.

18.3.1 String vs. &str

  • String: An owned, mutable text buffer. It frees its memory when it goes out of scope and can grow as needed.
  • &str: A borrowed slice of UTF-8 data, such as a literal ("Hello") or a substring of an existing String.
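
As a minimal sketch of how the two types interact, the example below passes both an owned String and a literal to the same function. The helper name shout is hypothetical and exists only for illustration; the point is that a parameter of type &str accepts either kind of string data.

```rust
// Hypothetical helper (not part of the standard library):
// accepts any borrowed string data and returns a new owned String.
fn shout(text: &str) -> String {
    // to_uppercase allocates a fresh String; the input is only borrowed.
    text.to_uppercase()
}

fn main() {
    let owned: String = String::from("hello");
    let literal: &str = "world";

    // &String coerces to &str automatically (deref coercion).
    let a = shout(&owned);
    let b = shout(literal);

    // A substring borrowed from the owned String; no bytes are copied.
    let slice: &str = &owned[0..2];

    println!("{} {} {}", a, b, slice); // prints "HELLO WORLD he"
}
```

Taking &str rather than &String in function signatures is usually the more flexible choice, since callers can pass literals, slices, or borrowed Strings without converting.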

18.3.2 String vs. Vec<u8>

Both store bytes on the heap, but String guarantees that its bytes are always valid UTF-8. Because a single character can occupy one to four bytes, a byte offset does not necessarily correspond to a character position, so String does not support indexing by integer. When handling arbitrary binary data, use a Vec<u8> instead.
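
The difference can be illustrated with a short sketch that converts raw bytes into text. The helper name bytes_to_text is an assumption made for this example; String::from_utf8 is the standard conversion, and it fails when the bytes are not valid UTF-8.

```rust
// Hypothetical helper: try to interpret raw bytes as text.
fn bytes_to_text(bytes: Vec<u8>) -> Option<String> {
    // String::from_utf8 validates the bytes and returns Err on invalid UTF-8;
    // ok() turns that Result into an Option.
    String::from_utf8(bytes).ok()
}

fn main() {
    // 'é' is encoded as the two bytes 0xC3 0xA9 in UTF-8.
    let valid = vec![0x63, 0x61, 0x66, 0xC3, 0xA9]; // "café"
    let invalid = vec![0x63, 0x61, 0x66, 0xC3]; // sequence cut off mid-character

    let text = bytes_to_text(valid);
    println!("{:?}", text); // Some("café")
    println!("{:?}", bytes_to_text(invalid)); // None

    if let Some(s) = text {
        // len() counts bytes; chars().count() counts Unicode scalar values.
        println!("{} bytes, {} chars", s.len(), s.chars().count()); // 5 bytes, 4 chars
    }
}
```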

18.3.3 Creating and Combining Strings

// From a string literal, via String::from or .to_string()
let s1 = String::from("Hello");
let s2 = "Hello".to_string();

// From other data
let number = 42;
let s3 = number.to_string(); // Produces "42"

// Empty string
let mut s4 = String::new();
s4.push_str("Hello");

Concatenation:

let s1 = String::from("Hello");
let s2 = String::from("World");

// The + operator takes ownership of s1 and only borrows s2
let s3 = s1 + " " + &s2;
// After this, s1 is unusable; s2 is still valid

// format! macro is often more flexible
let name = "Alice";
let greeting = format!("Hello, {}!", name); // No moves occur

18.3.4 Handling UTF-8

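Indexing a String with a single integer (s[0]) is a compile-time error, because a byte offset may fall in the middle of a multi-byte character. Slicing by a byte range (&s[0..2]) is permitted, but it panics at run time if either end is not on a character boundary. To process text one character at a time, iterate with chars():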

for ch in "Hello".chars() {
    println!("{}", ch);
}

For advanced Unicode handling (e.g., grapheme clusters), you may need external crates like unicode-segmentation.
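
When byte offsets are needed, char_indices and get offer safer alternatives to direct slicing. The following is a sketch under stated assumptions: the helper name first_chars is hypothetical, and "character" here means a Unicode scalar value (a Rust char), not a grapheme cluster.

```rust
// Hypothetical helper: return the first `n` chars of `s` as a borrowed slice.
fn first_chars(s: &str, n: usize) -> &str {
    // char_indices yields (byte_offset, char) pairs; the entry at position n
    // gives the byte offset where the slice must end.
    match s.char_indices().nth(n) {
        Some((byte_offset, _)) => &s[..byte_offset],
        None => s, // fewer than n chars: return the whole string
    }
}

fn main() {
    // "naïve" written with an escape so the 'ï' is a single scalar (U+00EF).
    let word = "na\u{EF}ve";

    // 'ï' occupies bytes 2 and 3, so the offsets skip from 2 to 4.
    for (offset, ch) in word.char_indices() {
        println!("{}: {}", offset, ch);
    }

    // get returns None instead of panicking when a range splits a character.
    println!("{:?}", word.get(0..3)); // None
    println!("{:?}", word.get(0..4)); // Some("naï")

    println!("{}", first_chars(word, 3)); // naï
}
```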

18.3.5 Common String Methods

  • push (adds a single char) and push_str (adds a &str):
    let mut s = String::from("Hello");
    s.push(' ');
    s.push_str("Rust!");
  • replace:
    let sentence = "I like apples.".to_string();
    let replaced = sentence.replace("apples", "bananas");
  • split and join:
    let fruits = "apple,banana,orange".to_string();
    let parts: Vec<&str> = fruits.split(',').collect();
    let joined = parts.join(" & ");
  • Converting to bytes:
    let bytes = "Rust".as_bytes();

18.3.6 Summary: String vs. C

C strings are typically null-terminated char * buffers whose length and capacity the programmer must track by hand, so resizing or copying them is error-prone. Rust’s String stores its own length and capacity, grows automatically when more space is required, and guarantees valid UTF-8. Slicing outside the buffer or off a character boundary panics instead of reading or writing past the allocation, and the memory is freed when the String value goes out of scope.
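
The capacity tracking described above can be observed directly. This is a minimal sketch; the helper name repeat_with_reserve is hypothetical, and the exact capacity values printed after growth depend on the allocator strategy, so none are shown for that line.

```rust
// Hypothetical helper: build `n` copies of `piece`, reserving space up front
// so that the loop does not need to reallocate.
fn repeat_with_reserve(piece: &str, n: usize) -> String {
    let mut out = String::with_capacity(piece.len() * n);
    for _ in 0..n {
        out.push_str(piece);
    }
    out
}

fn main() {
    // String::new does not allocate, so both length and capacity start at 0.
    let mut s = String::new();
    println!("len={}, capacity={}", s.len(), s.capacity());

    // push_str grows the buffer as needed; capacity is at least the length.
    s.push_str("Hello");
    println!("len={}, capacity={}", s.len(), s.capacity());

    let r = repeat_with_reserve("ab", 3);
    println!("{}", r); // ababab

    // An out-of-range request yields None rather than touching memory
    // beyond the buffer.
    println!("{:?}", r.get(0..100)); // None
} // s and r are dropped here and their allocations are freed
```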