Free Course Image Database Systems: SQL, Indexing, Query Optimization, Transactions and Distributed Databases

Free online courseDatabase Systems: SQL, Indexing, Query Optimization, Transactions and Distributed Databases

Duration of the online course: 33 hours and 54 minutes

New

Build real database skills: modern SQL, indexing, query optimization and transactions in a free online course with practical exercises and a certificate option.

In this free course, learn about

  • Relational model fundamentals, keys, and the benefit of physical data independence
  • Modern SQL querying, including correct GROUP BY use with aggregates
  • Disk-oriented storage layout: pages/files, slotted pages, and access pattern costs
  • LSM-trees and tradeoffs for write-heavy workloads
  • Row vs column stores, compression effects, and analytics vs OLTP performance tradeoffs
  • Buffer management and why many DBs avoid relying on OS mmap as the main mechanism
  • Hash tables in DBs, including linear probing and deletion via tombstones
  • Tree indexes (B+ trees) and why they excel at range scans; other index/filter types
  • Bloom filters: probabilistic membership tests vs ordered indexes like B+ trees
  • Index concurrency control: latches vs locks and their roles
  • Sorting, joins, and execution models (Volcano iterators, parallel exchanges, pushdown)
  • Concurrency control theory and protocols: 2PL, OCC, MVCC snapshot guarantees
  • Logging and recovery: WAL commit rules, ARIES, pageLSN purpose
  • Distributed DB architectures and replication: shared-disk vs shared-nothing, primary-replica

Course Description

Strong database knowledge is what turns basic CRUD into systems that stay fast, correct, and reliable under real-world load. This free online course helps you move beyond writing queries and into understanding how database engines actually work: how data is modeled, stored, found quickly with indexes, executed efficiently by the optimizer, and protected with transaction guarantees even when many users hit the system at once.

You will start with the relational model and the ideas that made it durable in industry, then connect those foundations to modern SQL used in production analytics and applications. From there, you will go under the hood to see why storage layout matters, why sequential access often beats random I/O, and how structures like log-structured merge trees and slotted pages shape performance characteristics. As you progress, you will learn how and why databases choose row or column formats, how memory and buffer management influence latency, and what tradeoffs appear when systems lean on the operating system versus owning their caching behavior.

Indexing and execution are treated as practical engineering topics: hash tables and tree indexes, what makes B+ trees effective for range scans, and how alternative structures such as inverted indexes or Bloom filters can accelerate specific workloads. You will also develop intuition for core operator algorithms like sorting and joins, then connect that to query execution models and parallel execution components used to scale out work across cores and machines. Along the way, exercises reinforce concepts so you can reason about why one plan is faster than another, not just memorize rules.

Correctness is as important as speed. The course explains concurrency control theory and brings it to life with two-phase locking, optimistic methods, and MVCC, showing how databases deliver isolation and consistent reads. You will then cover durability with write-ahead logging and recovery concepts, including how systems restore a consistent state after failures. Finally, you will explore distributed database architectures and replication patterns, learning the constraints and advantages of shared-disk versus shared-nothing designs and how transactional and analytical distributed systems make different tradeoffs.

By the end, you will be able to evaluate database behavior with an engineer’s mindset: predict bottlenecks, choose the right index strategy, understand what the optimizer is trying to do, and speak confidently about transactions, logging, and distributed designs in interviews and on the job.

Course content

  • Video class: #00 - Course Overview 21m
  • Video class: #01 - Relational Model 1h23m
  • Exercise: What is a key benefit of the relational model’s physical data independence?
  • Video class: #02 - Modern SQL (CMU Intro to Database Systems) 1h18m
  • Exercise: In SQL, why is a GROUP BY clause required when selecting a non-aggregated column alongside an aggregate (e.g., course_id with AVG(gpa))?
  • Video class: #03 - Database Storage: Files 1h22m
  • Exercise: In a disk-oriented database system, why is random access on nonvolatile storage typically slower than sequential access?
  • Video class: #04 - Database Storage: Log-Structured Merge Trees 1h22m
  • Exercise: What is the main purpose of using a slotted page (slot array) in a tuple-oriented storage layout?
  • Video class: #05 - Row vs. Column Storage Compression ? StarTree Database Talk (CMU Intro to Database Systems) 1h28m
  • Video class: #06 - Memory 1h23m
  • Exercise: Why do many database systems avoid using OS memory mapping (mmap) as the primary buffer management mechanism?
  • Video class: #07 - Hash Tables ? RelationalAI Database Talk (CMU Intro to Database Systems) 1h23m
  • Exercise: In linear probing hash tables, what is the standard way to handle deletions without breaking future lookups?
  • Video class: #08 - Tree Indexes: B Trees (CMU Intro to Database Systems) 1h22m
  • Exercise: Which statement best describes a key advantage of a B+ tree (vs. a B-tree) for range scans?
  • Video class: #09 - Vector Indexes, Inverted Indexes, Filters, Tries ? TiDB Talk (CMU Intro to Database Systems) 1h21m
  • Exercise: Which statement best describes a Bloom filter compared to an index like a B+ tree?
  • Video class: #10 - Index Concurrency Control (CMU Intro to Database Systems) 1h16m
  • Exercise: In database index concurrency control, what is the primary purpose of a latch (as opposed to a lock)?
  • Video class: #11 - Sorting 1h17m
  • Exercise: In external merge sort with B buffer pages available, how many pages are typically reserved for input runs during a merge pass?
  • Video class: #12 - Join Algorithms: Hash, Sort-Merge, Nested Loop Joins (CMU Intro to Database Systems) 1h14m
  • Exercise: Which join algorithm is typically preferred for large datasets when the join predicate is an equality match and the goal is to minimize disk I/O?
  • Video class: #13 - Query Execution Part 1 (CMU Intro to Database Systems) 1h23m
  • Exercise: In the iterator (Volcano) processing model, how does an operator signal that it has no more output tuples?
  • Video class: #14 - Query Execution Part 2 ? ClickHouse Database Talk (CMU Intro to Database Systems) 1h02m
  • Exercise: In parallel query execution, what is the main purpose of an exchange operator (e.g., gather)?
  • Video class: #15 - Query Planning 1h21m
  • Exercise: Which optimization change most directly reduces unnecessary work by applying a filter before a join?
  • Video class: #16 - Concurrency Control Theory ? Firebolt Database Talk (CMU Intro to Database Systems) 1h27m
  • Exercise: Which condition indicates that a schedule is conflict-serializable using a dependency (precedence) graph?
  • Video class: #17 - Two-Phase Locking Concurrency Control (CMU Intro to Database Systems) 1h05m
  • Exercise: What rule defines the transition between the growing and shrinking phases in two-phase locking (2PL)?
  • Video class: #18 - Optimistic Concurrency Control ? Weaviate Database Talk (CMU Intro to Database Systems) 1h20m
  • Exercise: In Optimistic Concurrency Control (OCC), when is a transaction typically assigned its timestamp in the described protocol?
  • Video class: #19 - Multi-Version Concurrency Control (CMU Intro to Database Systems) 1h27m
  • Exercise: In MVCC, what ensures that a transaction reads a consistent snapshot of the database as of its start time?
  • Video class: #20 - Database Logging ? Confluent Database Talk (CMU Intro to Database Systems) 1h19m
  • Exercise: In write-ahead logging (WAL), what must be guaranteed before acknowledging that a transaction has committed?
  • Video class: #21 - Database Recovery with ARIES (CMU Intro to Database Systems) 1h12m
  • Exercise: In ARIES recovery, what is the purpose of a pageLSN stored on each page?
  • Video class: #22 - Distributed Database Architectures ? DataStax Database Talk (CMU Intro to Database Systems) 1h23m
  • Exercise: In a shared-disk distributed database architecture, what is a key advantage compared to shared-nothing when adding more compute nodes?
  • Video class: #23 - Distributed Transactional Databases (CMU Intro to Database Systems) 1h24m
  • Exercise: In a primary-replica replication setup, which statement best describes how writes and reads are handled?
  • Video class: #24 - Distributed Analytical Databases (CMU Intro to Database Systems) 1h22m
  • Video class: #25 - BigQuery Snowflake Redshift Databricks DuckDB (CMU Intro to Database Systems) 1h25m
  • Exercise: Why can a branchless (predicate) scan outperform a branching scan on modern CPUs when predicate selectivity is around 50%?

This free course includes:

33 hours and 54 minutes of online video course

Digital certificate of course completion (Free)

Exercises to train your knowledge

100% free, from content to certificate

Ready to get started?Download the app and get started today.

Install the app now

to access the course
Icon representing technology and business courses

Over 5,000 free courses

Programming, English, Digital Marketing and much more! Learn whatever you want, for free.

Calendar icon with target representing study planning

Study plan with AI

Our app's Artificial Intelligence can create a study schedule for the course you choose.

Professional icon representing career and business

From zero to professional success

Improve your resume with our free Certificate and then use our Artificial Intelligence to find your dream job.

You can also use the QR Code or the links below.

QR Code - Download Cursa - Online Courses

More free courses at Databases

Free Ebook + Audiobooks! Learn by listening or reading!

Download the App now to have access to + 5000 free courses, exercises, certificates and lots of content without paying anything!

  • 100% free online courses from start to finish

    Thousands of online courses in video, ebooks and audiobooks.

  • More than 60 thousand free exercises

    To test your knowledge during online courses

  • Valid free Digital Certificate with QR Code

    Generated directly from your cell phone's photo gallery and sent to your email

Cursa app on the ebook screen, the video course screen and the course exercises screen, plus the course completion certificate