Shortcut to seniority

Persistence refers to storing state as data in the computer's data storage. Persistent data is data that is loaded from disk on application startup and stored back on disk at runtime or on application exit. Examples of persistent data include configurations, servers, accounts, and passwords, which improve usability for the user.
In programming, the software layer that allows the application to persist its state is generically called the persistence layer, and it usually relies on object serialization and deserialization.
Most software applications also require some sort of data storage, and that is where databases come in.
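As an illustration of serialization and deserialization, here is a minimal sketch of a persistence layer that stores application state as JSON. The function names, the settings keys, and the file location are assumptions made for this example only.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Serialize the state dictionary to a JSON file on disk."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)


def load_state(path, default):
    """Deserialize the state from disk, or fall back to defaults on first run."""
    if not os.path.exists(path):
        return dict(default)
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical settings file, placed in a temporary directory for the demo.
settings_path = os.path.join(tempfile.mkdtemp(), "settings.json")

# First startup: nothing on disk yet, so the defaults are used.
state = load_state(settings_path, {"server": "localhost", "theme": "light"})
state["theme"] = "dark"             # the user changes a setting at runtime
save_state(settings_path, state)    # the state is stored back on exit

# Next startup: the saved state is loaded instead of the defaults.
restored = load_state(settings_path, {})
```

JSON is only one possible format; the same two functions could be backed by a binary format or a database without changing the calling code.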
A database is a collection of information that is organized in a way so that it can be easily accessed, managed, and updated. The data is organized into rows, columns, and tables, and it is indexed to make it easier to find relevant information.
The Database Administrator is responsible for setting up, maintaining, securing, optimizing, and monitoring the databases, and also for setting up database schemas and writing stored procedures.
As a developer, you will also need to invest some time in learning how to work with databases, including concepts such as schemas and stored procedures.
A schema is a blueprint for the database. It defines the tables and in what format we store the data.
With a schema, you should be able to construct a complete empty copy of that database.
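To make this concrete, the sketch below builds an empty database from a schema alone, using Python's built-in sqlite3 module. The table and column names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: it defines the tables and the format of the stored data.
SCHEMA = """
CREATE TABLE accounts (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email    TEXT NOT NULL
);
CREATE TABLE servers (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),
    hostname   TEXT NOT NULL
);
"""

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The schema alone yields a complete, empty copy of the database structure.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
account_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```

After running this, the tables exist but contain no rows.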
Stored procedures are the functions or methods of a database: simply put, a stored procedure is a snippet of SQL code that lives directly in the database and can be called to interact with it.
The CAP theorem states that a distributed computer system cannot guarantee all three of the following properties at the same time: Consistency, Availability, and Partition tolerance. A BASE (Basically Available, Soft state, Eventual consistency) system gives up strong consistency in favor of availability.
Basically available indicates that the system does guarantee availability.
Soft state indicates that the state of the system may change over time, even without input. This is because of the eventual consistency model. Stores don’t have to be write-consistent or mutually consistent all the time.
Eventual consistency indicates that the system will become consistent over time, provided that it receives no new input during that time: given enough time, updates will eventually ripple through to all servers.
Eventual consistency is a weak guarantee, and does not make safety guarantees: an eventually consistent system can return any value before it converges.
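The toy simulation below sketches this behavior. It is not a real distributed system: the Replica class and its inbox of delayed updates are assumptions made for the example. A read from a lagging replica returns a stale value until the pending updates are applied.

```python
from collections import deque


class Replica:
    """A hypothetical replica that applies updates with a delay."""

    def __init__(self):
        self.data = {}
        self.inbox = deque()  # updates received but not yet applied

    def apply_pending(self):
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value


replicas = [Replica() for _ in range(3)]


def write(key, value):
    # The write is applied on one replica at once and only queued on the others.
    replicas[0].data[key] = value
    for replica in replicas[1:]:
        replica.inbox.append((key, value))


write("theme", "dark")

# Before convergence, a lagging replica has not seen the update yet.
stale_read = replicas[2].data.get("theme")

# With no new input and enough time, every replica applies its pending updates.
for replica in replicas:
    replica.apply_pending()
converged = [replica.data.get("theme") for replica in replicas]
```

Here stale_read is None because the update has not reached the third replica, while converged holds the same value for all three replicas.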
Some databases offer ACID (atomicity, consistency, isolation and durability) compliance to guarantee that data is consistent and that transactions are complete.
A transaction is often composed of multiple statements. Through atomicity, we have the guarantee that each transaction either completely succeeds or completely fails. If any of the statements within the transaction fails, the entire transaction fails, and the database is left unchanged. This guarantee holds in every situation, including power failures, errors, and crashes.
Consistency prevents database corruption by ensuring that each transaction can only bring the database from one valid state to another.
Transactions are often executed concurrently (multiple reads and writes to multiple tables at the same time), and through isolation, we can ensure that the execution of such transactions leaves the database in the same state that would have been obtained if they were executed sequentially.
Durability guarantees that a committed transaction will remain committed even in the case of a system failure (power outage, crash).
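Atomicity and consistency can be observed with a small experiment. The sketch below uses Python's sqlite3 module with a hypothetical accounts table whose CHECK constraint forbids negative balances. A transfer that would violate the constraint fails halfway through, and the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst))
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src))


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# The second statement violates the CHECK constraint, so the first
# statement (the credit to bob) is undone as well.
try:
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass
after_failure = balances(conn)

# A valid transfer applies both statements together.
transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
```

After the failed transfer the balances are unchanged, and after the valid one both accounts are updated together.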
There are two main types of database: SQL and NoSQL – or relational databases and non-relational databases.
Relational databases are structured, like phone books that store phone numbers and addresses.
A relational database consists of one or more tables with columns and rows.
Each row represents an entry, and each column stores a specific type of information. The relationship between tables and field types is called a schema, and it must be clearly defined before any information is added to the tables.
For a relational database to be effective, the data you’re storing in it has to be structured in a very organized way. A well-designed schema minimizes data redundancy and prevents tables from becoming out-of-sync.
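Following the phone book analogy, here is a small sketch of a relational layout using sqlite3. The people and phone_numbers tables are hypothetical. Each name is stored once, and the relationship between the tables is resolved with a join, which avoids repeating the name for every phone number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE phone_numbers (
    person_id INTEGER NOT NULL REFERENCES people(id),
    number    TEXT NOT NULL
);
INSERT INTO people (id, name) VALUES (1, 'Ada'), (2, 'Linus');
INSERT INTO phone_numbers (person_id, number)
VALUES (1, '555-0100'), (1, '555-0101'), (2, '555-0200');
""")

# The join combines rows from both tables through the person_id column.
rows = conn.execute("""
    SELECT people.name, phone_numbers.number
    FROM people
    JOIN phone_numbers ON phone_numbers.person_id = people.id
    ORDER BY people.name, phone_numbers.number
""").fetchall()
```

Renaming a person now means updating a single row in people, so the two tables cannot drift out of sync on that value.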
Non-relational databases are often document-oriented, like folders that hold everything for a given person, such as an address, phone number, etc.
With NoSQL (Not Only SQL), unstructured data can easily be found, but it is not categorized into fields as in a relational database. NoSQL databases offer ease of access, and they often provide APIs that allow developers to execute queries without having to learn SQL or understand the underlying architecture of the database.
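The sketch below imitates the document-oriented idea with a plain in-memory dictionary. It is not the API of any real NoSQL product: the put and find helpers are hypothetical. Each document is a self-contained "folder", and documents in the same collection do not need to share the same fields.

```python
import json

# Hypothetical in-memory collection: document id -> document.
documents = {}


def put(doc_id, doc):
    # A JSON round trip stores an independent copy of the document.
    documents[doc_id] = json.loads(json.dumps(doc))


def find(**criteria):
    """Return the documents whose fields match all the given criteria."""
    return [doc for doc in documents.values()
            if all(doc.get(key) == value for key, value in criteria.items())]


# The two documents have different fields; no schema was declared up front.
put("ada", {"name": "Ada", "city": "London",
            "phones": ["555-0100", "555-0101"]})
put("linus", {"name": "Linus", "city": "Portland",
              "email": "linus@example.com"})

in_london = find(city="London")
```

The query is expressed through a function call rather than SQL, which is the kind of API-style access described above.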
Replication is the process of making a copy (a replica) of something, and it is related to sharing information, in order to ensure consistency, to improve reliability, accessibility, or fault-tolerance.
Replication can be done through:
Database replication is available in many database management systems, usually through a master-slave replication scheme. The master logs the updates and sends them to the slaves, while each slave notifies the master when an update has been successfully received and applied, allowing the master to send subsequent updates.
There are two database replication schemes:
Master-slave scheme
The master-slave scheme is usually seen in high-availability clusters, where one master is designated to process all the requests, while the slave databases are kept synchronized with it.
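The log-and-acknowledge flow of master-slave replication can be sketched in a few lines. This is a simplified, single-process model with hypothetical Master and Slave classes, not the protocol of any particular database. The master appends each update to a log and only sends the next entry to a slave once the previous one has been acknowledged.

```python
class Slave:
    def __init__(self):
        self.data = {}
        self.next_index = 0  # position of the next log entry expected

    def receive(self, index, key, value):
        """Apply one update; the return value acts as the acknowledgement."""
        if index != self.next_index:
            return False  # out-of-order update, not applied
        self.data[key] = value
        self.next_index += 1
        return True


class Master:
    def __init__(self, slaves):
        self.data = {}
        self.log = []                    # ordered list of all updates
        self.slaves = slaves
        self.acked = [0] * len(slaves)   # log entries acknowledged per slave

    def write(self, key, value):
        # All requests go through the master, which logs every update.
        self.data[key] = value
        self.log.append((key, value))
        self.replicate()

    def replicate(self):
        for i, slave in enumerate(self.slaves):
            # Send the next entry only after the previous one was acknowledged.
            while self.acked[i] < len(self.log):
                index = self.acked[i]
                key, value = self.log[index]
                if not slave.receive(index, key, value):
                    break
                self.acked[i] = index + 1


slaves = [Slave(), Slave()]
master = Master(slaves)
master.write("theme", "dark")
master.write("server", "db-1")
in_sync = all(slave.data == master.data for slave in slaves)
```

In a real deployment the updates would travel over a network and acknowledgements could be delayed or lost, which is why the master tracks the acknowledged position for each slave separately.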
Multi-primary (multi-master) scheme
The multi-primary scheme is a method that allows data to be stored by a group of computers. Any replica can process a request and then distribute the resulting state to all the other members. The system is responsible for propagating the modifications and resolving any conflicts that might occur.
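Conflict resolution is the hard part of a multi-primary setup. The sketch below uses a "last write wins" policy with logical timestamps, and the node identifier as a tie-breaker. This policy is an assumption picked for the example, one of several possible strategies, and the Node class is hypothetical.

```python
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = 0    # logical timestamp, not wall-clock time
        self.store = {}   # key -> (timestamp, node_id, value)

    def write(self, key, value):
        """Accept a write locally and return the update to propagate."""
        self.clock += 1
        entry = (self.clock, self.node_id, value)
        self.store[key] = entry
        return key, entry

    def merge(self, key, entry):
        """Apply a remote update; the highest (timestamp, node_id) wins."""
        self.clock = max(self.clock, entry[0])
        current = self.store.get(key)
        if current is None or entry[:2] > current[:2]:
            self.store[key] = entry

    def read(self, key):
        return self.store[key][2]


a, b = Node("a"), Node("b")

# Both primaries accept a conflicting write before hearing from each other.
update_from_a = a.write("theme", "dark")
update_from_b = b.write("theme", "light")

# The updates are then exchanged, and each node resolves the conflict alone.
a.merge(*update_from_b)
b.merge(*update_from_a)
resolved = (a.read("theme"), b.read("theme"))
```

Both writes carry the same timestamp, so the tie is broken by node identifier and both nodes settle on the value written by node "b". One of the two writes is silently discarded, which is the known trade-off of this policy.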
The first two were initially introduced in Chapter 5 – Problem solving.
Priority queue (max heap): Elements are inserted based on their priority, thus the most important message is always the first one to be taken from the queue.
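A max-heap priority queue can be sketched with Python's heapq module. Since heapq implements a min heap, priorities are negated on insertion so the highest priority comes out first. The class name and the sample messages are made up for this example.

```python
import heapq
import itertools


class MaxPriorityQueue:
    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal priorities keep insertion order.
        self._counter = itertools.count()

    def push(self, priority, message):
        # heapq is a min heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._counter), message))

    def pop(self):
        """Remove and return the message with the highest priority."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = MaxPriorityQueue()
queue.push(1, "rotate logs")
queue.push(5, "disk almost full")
queue.push(3, "new user signup")

order = [queue.pop() for _ in range(len(queue))]
```

The messages come out from the highest priority to the lowest, regardless of the order in which they were inserted.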