Shortcut to seniority

Persistence refers to storing state as data in the computer's storage. Persistent data is data that is loaded from disk on application startup and written back to disk at runtime or on application exit. Examples of persistent data include configurations, servers, accounts, and passwords. Keeping them between runs makes the application easier to use.

In programming, the software layer that allows the application to persist its state is generically called the persistence layer, and it usually relies on object serialization and deserialization.
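As a minimal sketch of such a persistence layer, the following Python snippet serializes a small settings object to JSON on "exit" and deserializes it on "startup". The Settings class, its fields, and the file name are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class Settings:
    # Hypothetical application state, used only for illustration.
    server: str = "localhost"
    account: str = "guest"


def save_settings(settings: Settings, path: str) -> None:
    # Serialization: turn the in-memory object into text and write it to disk.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(settings), f)


def load_settings(path: str) -> Settings:
    # Deserialization: rebuild the object from disk, or fall back to defaults.
    if not os.path.exists(path):
        return Settings()
    with open(path, encoding="utf-8") as f:
        return Settings(**json.load(f))


# Simulate "exit, then start again" using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings.json")
    save_settings(Settings(server="db.example.com", account="alice"), path)
    restored = load_settings(path)
```

JSON is only one possible format here. A real persistence layer may use a binary format or a database instead.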

Most software applications also require some sort of data storage, and that is where databases come in.

A database is a collection of information organized so that it can be easily accessed, managed, and updated. In a relational database, the data is organized into tables of rows and columns, and it is indexed to make relevant information easier to find.

The Database Administrator is responsible for setting up, maintaining, securing, optimizing, and monitoring the databases, as well as setting up database schemas and writing stored procedures.

As a developer, you will also need to invest some time in learning your way around databases, such as how to:

Install and set up a database
Create and restore backups
Create tables and schemas
Create stored procedures
Write basic SQL code for common operations (queries, inserts, updates, etc.)
Join tables
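Several of the tasks above can be tried without installing anything, using the sqlite3 module from Python's standard library. The sketch below creates two tables, inserts and updates rows, and joins the tables in a query. The authors and books tables are hypothetical examples, not part of any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Create tables (hypothetical schema, for illustration only).
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
""")

# Insert rows; "?" placeholders keep the values out of the SQL text.
conn.execute("INSERT INTO authors (id, name) VALUES (?, ?)", (1, "Ann"))
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("First Book", 1), ("Second Book", 1)],
)

# Update one row.
conn.execute(
    "UPDATE books SET title = ? WHERE title = ?",
    ("First Book, 2nd ed.", "First Book"),
)

# Query across both tables with a join.
rows = conn.execute("""
    SELECT books.title, authors.name
    FROM books JOIN authors ON authors.id = books.author_id
    ORDER BY books.id
""").fetchall()
conn.commit()
```

The same SQL statements carry over to other relational databases with small dialect differences. Only the connection setup is specific to SQLite.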

Specific terms

Schema

A schema is a blueprint for the database. It defines the tables and the format in which the data is stored.

With a schema, you should be able to construct a complete empty copy of that database.
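To illustrate that idea, here is a small sketch that reads the schema of one SQLite database and uses it to build an empty copy. It relies on SQLite keeping its CREATE statements in the built-in sqlite_master table. The accounts table is a hypothetical example.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO accounts (name) VALUES ('alice'), ('bob');
""")

# SQLite stores the CREATE statements of a database in sqlite_master.
schema_sql = [
    row[0]
    for row in source.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"
    )
]

# Replaying the schema yields the same structure without any of the data.
copy = sqlite3.connect(":memory:")
for statement in schema_sql:
    copy.execute(statement)

source_count = source.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
copy_count = copy.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```

Here the source holds two rows while the copy has the same table and no rows.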

Stored procedures

Stored procedures are functions / methods for the databases, or simply put, a snippet of SQL code that lives directly on the database, and that can be called to interact with the database.

Properties of database transactions

BASE

The CAP theorem states that a distributed computer system cannot guarantee all of the following three properties at the same time: Consistency, Availability, and Partition tolerance. A BASE system (Basically Available, Soft state, Eventual consistency) gives up strong consistency in favor of availability.

Basically Available

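Basically available indicates that the system guarantees availability in the sense of the CAP theorem: every request receives a response, although that response may not reflect the most recent write.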

Soft state

Soft state indicates that the state of the system may change over time, even without input. This is because of the eventual consistency model. Stores don’t have to be write-consistent or mutually consistent all the time.

Eventual consistency

Eventual consistency indicates that the system will become consistent over time, provided that it does not receive new input during that time. In other words, updates will eventually ripple through to all servers, given enough time.

Eventual consistency is a weak guarantee, and does not make safety guarantees: an eventually consistent system can return any value before it converges.
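The behaviour described above can be sketched with a toy simulation. Two in-memory replicas exchange state, and a replica that has not yet synchronized returns a stale value. The Replica class and the last-write-wins rule based on timestamps are assumptions made for this sketch. Real systems use a variety of conflict resolution strategies.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Each key maps to (timestamp, value); the newer timestamp wins.
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Pull every entry the other replica knows about.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica(), Replica()
a.write("color", "red", timestamp=1)
b.sync_from(a)
a.write("color", "blue", timestamp=2)  # b has not seen this update yet

stale = b.read("color")      # still the old value
b.sync_from(a)               # the update ripples through
converged = b.read("color")  # both replicas now agree
```

Between the second write and the second sync, a client reading from b sees the old value, which is exactly the window in which an eventually consistent system offers no safety guarantee.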

ACID

Some databases offer ACID (atomicity, consistency, isolation and durability) compliance to guarantee that data is consistent and that transactions are complete.

Atomicity

A transaction is often composed of multiple statements. Atomicity guarantees that each transaction either completely succeeds or completely fails. If any of the statements within the transaction fails, the entire transaction fails, and the database is left unchanged. The guarantee holds in every situation, including power failures, errors, and crashes.
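A minimal sketch of atomicity, using Python's built-in sqlite3 module: a transfer consists of two UPDATE statements, and when the second one fails, the first one is rolled back as well. The accounts table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        name TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES ('alice', 100), ('bob', 0);
""")


def transfer(conn, src, dst, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )


try:
    transfer(conn, "alice", "bob", 500)  # the debit violates the CHECK
except sqlite3.IntegrityError:
    pass

# The credit to bob was undone together with the failed debit.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, both balances are exactly what they were before it started.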

Consistency

Consistency prevents database corruption by ensuring that each transaction can only bring the database from one valid state to another.
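In practice, "valid state" is defined by the constraints declared in the schema. The sketch below, again using sqlite3, shows two writes being rejected because they would break a UNIQUE constraint and a foreign key. The customers and orders tables are hypothetical. Note that SQLite enforces foreign keys only after the PRAGMA shown here is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    INSERT INTO customers (id, email) VALUES (1, 'a@example.com');
""")

invalid_statements = [
    # Duplicate email: violates the UNIQUE constraint.
    "INSERT INTO customers (id, email) VALUES (2, 'a@example.com')",
    # Order for a customer that does not exist: violates the foreign key.
    "INSERT INTO orders (customer_id) VALUES (99)",
]

rejected = []
for sql in invalid_statements:
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        rejected.append(sql)

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Both statements are refused, so the database stays in its previous valid state.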

Isolation

Transactions are often executed concurrently (multiple transactions reading from and writing to multiple tables at the same time). Isolation ensures that executing such transactions concurrently leaves the database in the same state that would have been obtained if they had been executed sequentially.
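One visible facet of isolation is that a transaction's uncommitted changes are hidden from other connections. The following sketch shows only that facet, using two sqlite3 connections to the same temporary database file. It is not a full demonstration of serializable execution, and the file and table names are hypothetical.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical file name

    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)
    writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, note TEXT)")
    writer.commit()

    def count(conn):
        # fetchall() runs the query to completion, so no read lock lingers.
        return conn.execute("SELECT COUNT(*) FROM events").fetchall()[0][0]

    # The INSERT opens a transaction on the writer that is not yet committed.
    writer.execute("INSERT INTO events (note) VALUES ('pending')")
    seen_before_commit = count(reader)

    writer.commit()
    seen_after_commit = count(reader)

    writer.close()
    reader.close()
```

The reader sees no rows while the writer's transaction is open and one row once it has been committed.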

Durability

Durability guarantees that a committed transaction will remain committed even in the case of a system failure (power outage, crash).
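A real crash is hard to stage in a short script, so the sketch below only approximates the idea: it closes a sqlite3 connection with one committed and one uncommitted insert, then reopens the file. The file and table names are hypothetical.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "durable.db")  # hypothetical file name

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.execute("INSERT INTO notes VALUES ('committed')")
    conn.commit()  # from here on, this row must survive
    conn.execute("INSERT INTO notes VALUES ('never committed')")
    conn.close()   # the open transaction is discarded

    # Reopen the database, as a restarted application would.
    reopened = sqlite3.connect(path)
    surviving = [row[0] for row in reopened.execute("SELECT body FROM notes")]
    reopened.close()
```

Only the committed row is still present after reopening the file.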

Relational vs Non-relational databases

There are two main types of database: SQL and NoSQL – or relational databases and non-relational databases.

Figure xxx - Databases - Relational vs Non-relational Databases

Relational databases

Relational databases are structured, like phone books that store phone numbers and addresses.

A relational database consists of one or more tables, each with columns and rows.

Each row represents an entry, and each column stores a specific type of information. The relationship between tables and field types is called a schema, and must be clearly defined before any information is added to the table.

For a relational database to be effective, the data you’re storing in it has to be structured in a very organized way. A well-designed schema minimizes data redundancy and prevents tables from becoming out-of-sync.

Non-relational Databases

Non-relational databases are often document-oriented, like folders that hold everything for a given person, such as an address, phone number, etc. Other kinds include key-value, wide-column, and graph stores.

With NoSQL (Not Only SQL), unstructured data can easily be found, but it is not categorized into fields as in a relational database. NoSQL databases provide ease of access, and they often provide APIs that allow developers to execute queries without having to learn SQL or understand the underlying architecture of the database.
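To make the "folder per person" idea concrete, here is a toy in-memory document collection in Python. It is only a sketch: the put and find helpers are hypothetical and do not correspond to the API of any particular NoSQL product.

```python
import json

# A toy "collection": document id -> JSON text.
collection = {}


def put(doc_id, document):
    # Each document is stored whole; no table schema is declared up front.
    collection[doc_id] = json.dumps(document)


def find(predicate):
    # Scan all documents; real document stores use indexes instead.
    docs = (json.loads(text) for text in collection.values())
    return [doc for doc in docs if predicate(doc)]


put("p1", {"name": "Ann", "phone": "555-0100", "address": {"city": "Cluj"}})
put("p2", {"name": "Bob", "emails": ["bob@example.com"]})  # different fields are fine

in_cluj = find(lambda doc: doc.get("address", {}).get("city") == "Cluj")
```

The two documents have different fields, and the query is an ordinary function call rather than an SQL statement.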

Replication

Replication is the process of making a copy (a replica) of something. It involves sharing information across redundant resources in order to ensure consistency and to improve reliability, accessibility, or fault tolerance.

Replication can be done through:

Data replication: Same data is stored on multiple storage devices
Computation replication: The same task is executed many times, either replicated in space (tasks are executed on separate devices) or replicated in time (tasks are executed repeatedly on one device)

Database Replication

Database replication is used in many database management systems, usually with a master-slave replication scheme. The master logs the updates and sends them to the slaves, while each slave notifies the master when an update has been successfully received and applied, allowing the master to send subsequent updates.
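The log-and-acknowledge exchange described above can be sketched as a small in-memory simulation. The Master and Slave classes and their methods are hypothetical. Real database systems ship their logs over the network and handle failures, which this sketch leaves out.

```python
class Slave:
    def __init__(self):
        self.data = {}
        self.applied = 0  # how many log entries this replica has applied

    def receive(self, index, key, value):
        # Apply an entry only if it is the next one expected, then
        # acknowledge by returning the number of entries applied so far.
        if index == self.applied:
            self.data[key] = value
            self.applied += 1
        return self.applied


class Master:
    def __init__(self, slaves):
        self.data = {}
        self.log = []                   # ordered (key, value) updates
        self.slaves = slaves
        self.acked = [0] * len(slaves)  # acknowledged log position per slave

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def replicate(self):
        for i, slave in enumerate(self.slaves):
            # Send the next entry only after the previous one was acknowledged.
            while self.acked[i] < len(self.log):
                key, value = self.log[self.acked[i]]
                self.acked[i] = slave.receive(self.acked[i], key, value)


slaves = [Slave(), Slave()]
master = Master(slaves)
master.write("user:1", "alice")
master.write("user:1", "alice2")
master.replicate()
in_sync = all(slave.data == master.data for slave in slaves)
```

Because every slave applies the same log in the same order, each one ends up with the same data as the master.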

There are two database replication schemes:

Master-slave scheme

The master-slave scheme is usually seen in high-availability clusters, where one master is designated to process all the requests, while the slave databases are kept synchronized with it.

Multi-primary (multi-master) scheme

The multi-primary scheme is a method that allows data to be stored by a group of computers and updated by any member of the group. Any replica can process a request and then distribute the new state to all the other members. The system is responsible for propagating the modifications and resolving any conflicts that might occur.

Badea Robert