Muchang Bahng

Computer Science

If you were to invent computer science from scratch, how would I do it? We can divide them into three chronological pillars. First is the theory of algorithms. Second is the construction of computers. Third is the applications of algorithms.

The design and analysis of algorithms are much older than computers in general, having existed as early as 2000 BC when Egyptians developed methods to multiply two numbers (even more primitive cookbooks can be considered algorithms). But in order to do any algorithm, we need a way to store data, whether it's through writing in a piece of paper or storing in computer memory. This requires the use of data structures which, coupled with algorithms, can solve whatever problems that came up in our lives. Note that none of these data structures or algorithms require the use of a computer at all. Graphs could be drawn on a piece of paper, a stack of accounting bills on your desk is literally the stack data structure, and a long list of names and ages of citizens in a tablet can be considered an array (erasing someone off from a tablet would leave an empty space). These different forms of bookkeeping, through different data structures, have advantages and disadvantages.

When mechanical computers were invented in 1822 followed by the first electronic comptuers in 1943, we could rapidly scale these algorithms. Computing prime numbers did not have to be done by hand, and all the theory developed before then can be applied much more widely. But if you look at all the components in your computer, it's baffling to wonder how humanity could create such a complex and intricate object. I mean, certain smartphones that fit into your pocket is able to store the equivalent of 8 trillion on-off switches! This is why the next pillar deals with the construction of computers, which starts off with the most basic component: a transitor. This requires computer engineering and physics knowledge, but from here we slowly build upon abstractions to instruction sets and CPU design. The instruction set of a CPU is strongly tied to the architecture of a computer, and so we must introduce the Assembly language as a way to interact with the computer. Once we introduce the concept of memory (e.g. how different data types are stored), input/output devices, and databuses, this completes the von Neumann architecture of a computer. Finally, we talk about the rest of the computer, such as disks (how data persists even after the computer is turned off), GPUs, etc. Once we are finished with the computer engineering part, we now know how a computer is built. Finally, we talk about some low-level programs such as the firmware that is required to run before a computer turns on.

We have successfully built a computer, but we are not out of the construction pillar just yet. Just because we have a functional computer, it doesn't mean that it is very user-friendly. Therefore, an abstraction over the different architectures is the C programming language (plus C++ since I use it frequently). But to work with languages at all, we must need some way to turn the human-friendly language into machine code that the computer can process. This is the job of compilers, which constitutes another set of notes. Furthermore, computers are not monolithic objects that can operate one task at a time. There is a need to control multiple programs/processes and execute them concurrently, which is the job of the operating system (OS) and opens up a whole new can of worms on how to manage these processes (e.g. virtual memory, mutex locks, etc.). Note that the OS is simply another process, and in order to start the OS, the previously mentioned firmware must activate a bootloader that finds the correct operating system, and so bootloaders such as GRUB will be talked about here. Once the OS has started, this is a natural point to introduce shells (terminals) as a way to interact with the computer through the OS. Finally, we can introduce virtualization software like virtual machines (VMs) and containers (e.g. Docker) which isolate different processes as if they were running with their own operating system.

Great, now that we know how algorithms work and how to run them on computers, we can focus on the application of algorithms, our third pillar. I introduce the two most common languages in the world, Python and JavaScript, which are much higher level and used the most often when developing software or doing research. A powerful feature of the computer is the ability to connect to the world wide web, but to understand this we need to know how computers talk to each other, i.e. computer networking. On a closely related note, networking now allows multiple computers to work on one task, which is the topic of distributed systems. With the explosion of computing power and networks all over the world, the need for databases to store this exponentially growing data also motivates its own area of study.

On an individual level, there are several meta-programming tools that a developer runs into when working with code. First, you need a text editor, and I give a quick guide on NeoVim which is my go-to. Second, you need good version control, which is a systematic way to update code when working in a team, and git is universally used for this. Good version control almost always comes with automated processes and tests (usually run on Docker containers), known as continuous integration/development (CI/CD). You will see all of these concepts in (mostly) industry and (less in) research.

Finally I end with some miscellaneous applications such as blockchain and SLAM.

All of my personal notes are free to download, use, and distrbute under the Creative Commons "Attribution- NonCommercial-ShareAlike 4.0 International" license. Please contact me if you find any errors in my notes or have any further questions.

Data Structures and Algorithms [CS201, CS330]

  • Fundamental Operations: Addition, Subtraction, Multiplication, Division, Bit vs Value Runtime
  • Graphs: DFS, BFS, Shortest Path, Bipartite Graphs, Graph Colorings, Minimum Spanning Trees

Computability and Complexity Theory

  • Introduction: Computers, Internet Protocol, Networks, UDP/TCP, DNS/HTTPS, Binary Representation of Data
  • Information Theory: Shannon Entropy, Data Compression
  • Finite Computation: AON-CIRC Programs, Syntactic Sugar, NAND-CIRC Programs
  • Uniform Computation: Infinite Functions, Deterministic Finite Automata, Regular Expressions, Turing Machines, Turing Completeness, Cellular Automata, Lambda Calculus, Uncomputable Functions
  • Efficient Algorithms: Modeling Running Time, Time Hierarchy Theorem, Non-Uniform Computation, Polynomial Time Reductions, Traveling Salesman Problem, Graph Problems, 3SAT, NP Hard/Completeness

Advanced Algorithms

  • Graph Algorithms
  • Geometric Algorithms
  • Probabilistic Algorithms

Quantum_Computing [PHYS 627]

  • Quantum Mechanics: Bra-ket Notation, State Space, Unitary Evolution, General & Projective Measurements, POVM, Distinguishable Quantum States, Composite Systems as Tensor Product Spaces, Heisenberg Uncertainty Principle, Entanglement
  • Qubits & Quantum Circuits: Superposition, Born's Rule, Change of Computational Basis, Bloch Sphere, Global & Relative Phase, Logic Gates (Pauli matrices, Hadamard, CNOT, etc.), No-Cloning Theorem, Bell States, Teleportation
  • Quantum Algorithms: Reversible extensions of classical gates, Quantum Parallelism, Deutch-Jozsa Algorithm

Architecture

  • Transistors, Circuits
  • How are numbers stored?
  • Processor Design
  • Ram
  • Disk, how memory is stored when machine is powered off
  • Firmware
  • Von Neumann Architecture
  • GPUs
  • Drivers

C

C++

  • Basics: Variables, Expressions, Functions, Scope
  • Translation: Preprocessing, Directives, Compilation, Linking, Header Files, Namespaces
  • Types: Casting
  • Constant Expressions: Compiler Optimization, Constants, Compile-Time Programming, Constexpr

Operating Systems

  • Threads, Processes
  • Mutexes
  • Virtual Memory
  • Virtual Machines
  • Containers (Docker)
  • Shells and terminals

Python

  • Names and Values: Rebinding vs Mutating, Immutabale and Mutabale Types
  • Function Closures and Variable Scopes
  • Lists: Implementation, Similar Built-In Data Structures
  • Hashing: Implementation, Similar Built-In Data Structures
  • Iterators and Loops
  • Item Assignment with Walrus Opreator
  • Raising Exceptions
  • Decorators
  • Composing CLasses

Javascript

  • Document Object Model (DOM)
  • JavaScript Runtime Engine
  • Events and Event Handling
  • Asynchronous Handling: Callback functions, Promises, API Handling, Async/Await

Distributed Systems

  • High Performance Computing (SLURM)
  • Edge Computing
  • Internet of Things

Databases [CS316]

  • Relational Algebra:
  • Functional Dependencies:
  • Entity-Relation Diagrams:
  • SQL

Tools

  • Text Editors
  • Git
  • CI/CD

Blockchain

  • Cryptography & Encoding: Symmetric, Asymmmetric Encryption, SHA256 Hash Function, Digital Signatures, Elliptic Curve Cryptography, Base58Check, WIF Format
  • Cryptocurrency Wallets: Bitcoin Keys, Addresses, WIF-Compression, Hierarchical Deterministic Wallets, (Hardened) (Extended) Child Key Derivation, HD Wallet Key Path
  • Transaction Chains: UTXO, Fees, Orphan Transaction Pool
  • Standard Transactions: Locking/Unlocking Scripts, Script Language, P2PKH, P2SH, P2PK, MultiSig, OP_RETURN Scripts
  • Bitcoin Network: Full Nodes, SPV nodes, Bootstrapping, UTXO pools, networking, three-way handshake
  • Blockchain: Block Header, Linking Blocks, Merkle Trees & Paths
  • Mining: Coinbase transaction, Proof-of-Work, Difficulty Targeting, Network Block Validation, Blockchain Forks, Hash Rate, Mining Pools, Consensus Attacks

Simultaneous Localization and Mapping [CS394]

  • Hardware: Sensors (GPS, LiDAR), Cameras, IMUs, Lens, Wireless Connections, Sparse/Dense Maps
  • Image Processing: Feature Points, ORB Algorithm, Segmentation, Image Pyramids, Camera Parameterization, World/Camera Coordinates, Intrinsic/Extrinsic Matrix
  • Transformations: Lie Groups, Lie Algebras, SO(3), SE(3), Euler Angles, Quaternions, Exponential Map
  • Frontend: Visual Odometry, Keyframes, Mappoints, Epipolar Geometry, Triangulation, Essential/Fundamental Matrix, 8-Point Algorithm,
  • Backend: Filters, Extended Kalman Filter (EKF), Nonlinear Optimization, Graph Optimization, Bundle Adjustment, Sliding Window Filters, Pose Graphs, Sparse Optimization
  • Beyond: Loop Closing, EdgeSLAM, Multi-Agent Collaborative SLAM