Safekipedia

String (computer science)

Adapted from Wikipedia · Adventurer experience

Illustration showing how a string is made up of individual characters in computer science.

In computer programming, a string is a sequence of characters, like letters, numbers, or symbols, that make up words, sentences, or other text. These sequences can be fixed, meaning they stay the same once made, or they can change and grow as needed. Strings are usually kept in memory using an array data structure that holds each character in order.

Strings are typically made up of characters, and are often used to store human-readable data, such as words or sentences.

Depending on the programming language, a string might use a set amount of space in memory or change its size when needed. When a string is written right in the code, it is called a string literal or an anonymous string.

In more advanced topics like formal languages, studied in mathematical logic and theoretical computer science, a string is a sequence of symbols from a group known as an alphabet. This helps computer scientists learn how languages and calculations work in a deeper way.

Purpose

Strings are used to store text that people can read, such as words and sentences. They help programs show messages to users or receive input from them. For example, a program might display a message like "file upload complete" to end users. Users can also type in text, like "I got a new job today" on a social media site, which the program stores in a database.

Strings can also hold data that isn't meant for people to read, like letters representing DNA sequences such as "AGATGCCGT" or special codes like "?action=edit" in a query string. The word "string" can sometimes describe sequences of other kinds of data, but it usually means a sequence of characters.

History

The word "string" has been used for a long time to describe things arranged in a line. In the 1800s, people who arranged letters for printing used "string" to talk about a row of printed letters.

Later, the idea of a "string" was used in math and language studies to mean a group of symbols in a certain order. This helped people study how symbols behave in rule-based systems. One of the first computer languages to work well with strings was COMIT in the 1950s, followed by SNOBOL in the early 1960s.

String datatypes

See also: Comparison of programming languages (string functions)

A string datatype is a data type used in computer programming to hold sequences of letters and symbols. Strings are very useful and are found in almost every programming language. Some languages treat strings as primitive types, while others treat them as composite types. The way a programming language works with strings can change how they are written and used.

Strings can be fixed-length, meaning they have a set size decided when the program is made. Or they can be variable-length, meaning they can grow or shrink while the program runs, depending on what needs to be stored. Most modern programming languages use variable-length strings, but their length is still limited by how much memory is available.
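As a rough illustration, here is a small Python sketch. Python is chosen only for the example, since the article does not name a language. A variable-length string simply grows when text is added, while a fixed-length field can be imitated by padding or cutting text to a set width. The helper name to_fixed_width is made up for this example.

```python
def to_fixed_width(text: str, width: int) -> str:
    """Imitate a fixed-length string: pad with spaces or cut to `width`."""
    return text[:width].ljust(width)


# Variable-length: the string grows as more text is added.
greeting = "hi"
greeting = greeting + " there"

# Fixed-length: every value takes exactly 8 characters.
short_name = to_fixed_width("Ann", 8)
long_name = to_fixed_width("Bartholomew", 8)

print(len(greeting))      # 8
print(repr(short_name))   # 'Ann     '
print(repr(long_name))    # 'Bartholo'
```

The short name is padded with spaces and the long name loses its last letters, which is the usual trade-off of a fixed size.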

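The string "FRANK" stored with a NUL byte marking its end (bytes shown in hexadecimal):

F    R    A    N    K    NUL  k    e    f    w
46   52   41   4E   4B   00   6B   65   66   77

The string "FRANK" stored with its length in the first byte (bytes shown in hexadecimal):

length  F    R    A    N    K    k    e    f    w
05      46   52   41   4E   4B   6B   65   66   77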
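The two byte layouts above, one ending in a NUL byte and one starting with a length byte, can be read back with a few lines of Python. This is only a sketch: the function names are invented for the example, and it assumes plain ASCII text and a length that fits in one byte.

```python
def read_null_terminated(buffer: bytes) -> str:
    """Read characters up to (not including) the first NUL byte."""
    end = buffer.index(0)          # position of the 0x00 terminator
    return buffer[:end].decode("ascii")


def read_length_prefixed(buffer: bytes) -> str:
    """Read the length from the first byte, then that many characters."""
    length = buffer[0]
    return buffer[1:1 + length].decode("ascii")


# The same bytes as in the layouts above. The trailing "kefw" stands
# for other data that happens to sit in memory after the string.
null_terminated = bytes.fromhex("4652414E4B00") + b"kefw"
length_prefixed = bytes.fromhex("054652414E4B") + b"kefw"

print(read_null_terminated(null_terminated))   # FRANK
print(read_length_prefixed(length_prefixed))   # FRANK
```

Both readers stop after "FRANK" and ignore the extra bytes, but they find the end of the string in different ways.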

Literal strings

Main article: String literal

Sometimes, we need to put words and sentences inside files so both people and computers can read them. This is useful when we write code or set up programs.

One easy way to do this is to put the words between quotation marks, like "hello" or 'hello'. This works in most programming languages. If we need to use special characters, like the quotation mark itself, we can use escape sequences, which usually start with a backslash. Another way is to let a newline mark the end of the text, as in some Windows settings files called INI files.
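Here is how quoted literals and escape sequences look in Python, used here only as one example language. Other languages follow similar rules, but the details can differ.

```python
double_quoted = "hello"
single_quoted = 'hello'

# A backslash lets a quotation mark appear inside a quoted literal.
with_quotes = "She said \"hello\" to me."

# \n is the escape sequence for a newline character.
two_lines = "first line\nsecond line"

print(double_quoted == single_quoted)   # True
print(with_quotes)                      # She said "hello" to me.
print(len(two_lines.splitlines()))      # 2
```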

Non-text strings

A string in computer science can be any group of similar data, not just letters. For example, a bit string or byte string can hold binary data, like information from a computer or phone. Whether this data is kept as a string depends on what the programmer needs and what the programming language allows.
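As a small sketch of a non-text string, Python has a bytes type whose items are numbers rather than letters. The values below are made up for the example.

```python
# A byte string: each item is a whole number from 0 to 255.
raw = bytes([72, 105, 0, 255])

# A bit string for the first byte, written with only the symbols 0 and 1.
bits = format(raw[0], "08b")

print(list(raw))   # [72, 105, 0, 255]
print(len(raw))    # 4
print(bits)        # 01001000
```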

In the programming language C, there is a difference between a "string" (which always ends with a special NUL character) and an "array of characters" (which may not). Using C's string functions on an array that has no NUL at the end may seem to work at first, but it can cause bugs and security problems later.

String processing algorithms

There are many ways to work with strings in computer programs. Each way has its own good and bad points. We can study these methods to see how fast they work and how much space they need. The term stringology was coined in 1984 by a computer scientist to describe this field of study.

Some methods include finding parts of strings, changing strings, sorting them, using special patterns called regular expressions, breaking strings apart, and finding patterns in sequences. More advanced methods use special tools like suffix trees and finite-state machines.
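One of the simplest of these methods, finding a part of a string, can be sketched in Python as below. This is the plain "try every position" approach, not one of the faster algorithms, and the function name find_substring is made up for the example. Sorting and splitting are shown with Python's built-in tools.

```python
def find_substring(text: str, pattern: str) -> int:
    """Return the first index where `pattern` starts in `text`, or -1."""
    last_start = len(text) - len(pattern)
    for start in range(last_start + 1):
        # Compare the pattern with the slice of text at this position.
        if text[start:start + len(pattern)] == pattern:
            return start
    return -1


dna = "AGATGCCGT"
print(find_substring(dna, "GCC"))   # 4
print(find_substring(dna, "TTT"))   # -1

print(sorted(["pear", "apple", "fig"]))   # ['apple', 'fig', 'pear']
print("a,b,c".split(","))                 # ['a', 'b', 'c']
```

This search can be slow on long texts because it may compare the same characters many times. Faster methods avoid that repeated work.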

Character string-oriented languages and utilities

Character strings are very useful in computer programs. Some languages are made to work with them easily. Examples include AWK, Icon, MUMPS, Perl, Rexx, Ruby, sed, SNOBOL, Tcl, and TTM.

Many tools on Unix systems can change and work with strings easily. Files and streams can be treated like strings, too. Some APIs such as Multimedia Control Interface, embedded SQL, or printf use strings to store commands.

Scripting languages like Perl, Python, Ruby, and Tcl use special text patterns called regular expressions to help with text tasks. Perl is well-known for this. Some languages, like Perl and Ruby, also let you add values directly into strings while writing code. This is called string interpolation.
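The sketch below shows both ideas in Python. Python's f-strings are used here as a stand-in for the interpolation found in Perl and Ruby, and the names and values are invented for the example.

```python
import re

name = "Ada"
score = 42

# String interpolation: the values are placed directly into the text.
message = f"{name} scored {score} points"

# Regular expression: \d+ matches one or more digits in a row.
numbers = re.findall(r"\d+", "Room 12, floor 3")

print(message)   # Ada scored 42 points
print(numbers)   # ['12', '3']
```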

Character string functions

See also: Comparison of programming languages (string functions)

String functions help us work with strings and change what they say. They can also tell us about a string. Different programming languages have different names for these functions.

A simple example is the string length function. This tells us how many characters are in a string, counting spaces too. It might be called length, len, or size. For example, length("hello world") would give us the number 11. Another common function is concatenation, which joins two strings together, often using the + sign.
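In Python, for instance, the length function is called len and the + sign joins strings. Other languages may use different names.

```python
text = "hello world"
print(len(text))    # 11 (the space counts as a character)

joined = "hello" + " " + "world"
print(joined)           # hello world
print(joined == text)   # True
```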

Some microprocessor instruction set architectures have special commands for working with strings. An example is REPNZ MOVSB on Intel x86 processors.

Formal theory

Strings are sequences of characters, like letters or numbers, used in computer programming. They can have a fixed length or change after they are made.

For example, if the alphabet is {0, 1}, then "01011" is a string made from that alphabet. The length of a string is the number of symbols it has. The empty string has no symbols, so its length is 0.
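These ideas can be tried out in a short Python sketch. The function name is_string_over is made up for the example.

```python
from itertools import product

alphabet = {"0", "1"}


def is_string_over(word: str, symbols: set) -> bool:
    """True if every symbol of `word` comes from `symbols`."""
    return all(ch in symbols for ch in word)


print(is_string_over("01011", alphabet))   # True
print(is_string_over("0102", alphabet))    # False
print(len("01011"))                        # 5
print(len(""))                             # 0, the empty string

# All strings of length 2 made from the symbols 0 and 1.
length_two = ["".join(p) for p in product("01", repeat=2)]
print(length_two)   # ['00', '01', '10', '11']
```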

Strings can be combined, cut into pieces, reversed, or rotated. They can also be ordered in a specific way called lexicographical order, like in a dictionary.
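Each of these operations can be written in one line of Python, as in the sketch below. The example strings are made up.

```python
s = "abc"
t = "de"

combined = s + t                        # combining (concatenation)
piece = combined[1:4]                   # cutting out a piece (substring)
backwards = combined[::-1]              # reversing
rotated = combined[2:] + combined[:2]   # rotating by two places

print(combined)    # abcde
print(piece)       # bcd
print(backwards)   # edcba
print(rotated)     # cdeab

# Lexicographical order, as in a dictionary.
words = sorted(["banana", "apple", "cherry", "apricot"])
print(words)   # ['apple', 'apricot', 'banana', 'cherry']
```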

This article is a child-friendly adaptation of the Wikipedia article on String (computer science), available under CC BY-SA 4.0.

Images from Wikimedia Commons.