CS Fundamentals
- Resource Manager
- API for app developers: it exposes a set of methods (system calls) that applications can call to get work done using the machine's resources
- UniProgramming:
- Only one program can run at a time, e.g. an ATM
- Embedded systems
- Multi Programming
- More than one program at a time
- Multi-tasking systems are the same idea as multiprogramming systems
- Unix-like systems (macOS, Linux): "programs"/processes
- Windows: "tasks"
- 2 types of multi program categorization
- Non-preemptive
- supports multiprogramming
- but only one program runs at a time
- you cannot switch to another program until the current one completes
- Preemptive - where a task can be paused/stopped in the middle of execution
- CPU vs GPU
- CPU is like one very capable worker running at GHz speeds
- GPU is like hundreds of simpler workers, each running at a few hundred MHz
- Program - set of instructions to run. It is stored on disk
- Process - Program that is running. It is stored on RAM and run by CPU
- Process Control Block (PCB)
- Info maintained about each running process
- id
- resources used(memory, printer..)
- CPU related info(Priority)
- Memory limits
- state
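The PCB fields listed above can be sketched as a plain Java class. This is only an illustration; the field names and types are assumptions, not any real OS's layout.

```java
// A minimal sketch of a Process Control Block (PCB); field names are
// illustrative assumptions, not taken from any specific OS.
public class PCB {
    enum State { NEW, READY, RUNNING, WAITING, TERMINATED }

    final int pid;            // process id
    int priority;             // CPU-related info used by the scheduler
    long programCounter;      // where to resume execution
    long memoryLimit;         // upper bound of the address space
    State state;              // current life-cycle state

    PCB(int pid, int priority, long memoryLimit) {
        this.pid = pid;
        this.priority = priority;
        this.memoryLimit = memoryLimit;
        this.state = State.NEW;
    }

    public static void main(String[] args) {
        PCB p = new PCB(42, 5, 64 * 1024 * 1024);
        p.state = State.READY;   // placed on the ready queue
        System.out.println("pid=" + p.pid + " state=" + p.state);
    }
}
```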
CPU Scheduling
- 2 types of processes
- CPU bound processes(more calculation work)
- Video processing, Image processing
- I/O bound processes(more of user interaction)
- Interrupt:
- A signal to the OS that the current program must pause for something else, e.g. an I/O call, a network call, or a disk read
- When an interrupt is raised, the CPU saves the current process's state into its PCB in memory and allocates the CPU to some other process
- Another interrupt is raised once the I/O activity is done. When the scheduler decides to resume the process, its PCB is loaded back and execution continues
- Life cycle
- Start a program -> Ready queue (List&lt;PCB&gt;) -> Scheduler allocates the CPU -> I/O request -> I/O queue -> I/O complete -> Ready queue
- Context Switch
- Saving a process's state from the CPU registers out to memory, and loading another process's state back into the registers, takes time. This overhead is called a context switch
- CPU scheduling algorithms
- FCFS
- non-preemptive algorithm (cannot switch to another task until the current one completes)
- SRTF
- Shortest remaining time first(use minHeap)
- Pre-emptive algorithm
- Decides which process to run whenever
- a previous process completes, OR
- a new process arrives
- How to estimate burst/completion time
- average of previous run times
- size of the program
- SJF: Shortest job first
- non-preemptive algorithm
- when previous process completes- pick up the shortest job in the queue
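SJF and SRTF both pick the shortest job via a min-heap. A minimal non-preemptive SJF sketch using Java's `PriorityQueue`, assuming all jobs arrive at t=0 with made-up burst times:

```java
import java.util.PriorityQueue;

// Non-preemptive SJF sketch: when the CPU is free, pick the job with the
// smallest burst time from a min-heap. Burst times here are made-up.
public class SJF {
    static double avgWaitingTime(int[] bursts) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        for (int b : bursts) minHeap.add(b);   // all jobs arrive at t=0

        int time = 0, totalWait = 0;
        while (!minHeap.isEmpty()) {
            int burst = minHeap.poll();  // shortest job first
            totalWait += time;           // this job waited until now
            time += burst;               // run it to completion
        }
        return (double) totalWait / bursts.length;
    }

    public static void main(String[] args) {
        // bursts 6, 8, 7, 3 run in order 3, 6, 7, 8
        System.out.println(avgWaitingTime(new int[]{6, 8, 7, 3})); // 7.0
    }
}
```

Running the longest job first instead would give a much worse average wait, which is why SJF is optimal for mean waiting time.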
- Starvation in CPU scheduling
- A process keeps waiting for resources while newer processes keep arriving and acquiring them
- SRTF has starvation problem
- Overcome starvation limitation with Priority Scheduling
- Lower priority process might have to wait too much to get CPU
- Every time a process is denied the resource, increase its priority by one (aging)
- Round robin CPU scheduling
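Round robin can be sketched with a plain queue: each process gets a fixed time slice (quantum), and an unfinished process goes to the back of the queue. Burst times and the quantum below are made-up values.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Round-robin sketch: run each process for at most `quantum` time units;
// if it is not finished, requeue it (preemption).
public class RoundRobin {
    static int totalTime(int[] bursts, int quantum) {
        Queue<int[]> ready = new ArrayDeque<>(); // each entry: {pid, remaining}
        for (int i = 0; i < bursts.length; i++) ready.add(new int[]{i, bursts[i]});

        int time = 0;
        while (!ready.isEmpty()) {
            int[] p = ready.poll();
            int slice = Math.min(quantum, p[1]);
            time += slice;
            p[1] -= slice;
            if (p[1] > 0) ready.add(p);          // preempted, back of the queue
            else System.out.println("P" + p[0] + " done at t=" + time);
        }
        return time;
    }

    public static void main(String[] args) {
        totalTime(new int[]{5, 3, 8}, 2); // P1 at t=9, P0 at t=12, P2 at t=16
    }
}
```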
- Threads
- Multi core vs single core
- Concurrency vs Parallelism
- Print 1..10 via separate threads
The CPU scheduler uses multiple queues, each with its own scheduling algorithm; round robin happens across the queues (multilevel queue scheduling).
- High priority task queue(priority scheduler)
- IO intensive task queues(SRTF)
- Low priority task queues(FCFS)
- System task queues(RR)
Threads
- Basic unit of CPU execution
Process
- Single thread: the process has code, data, open files, counters (last instruction executed), a function stack, and one thread with access to all of the above
- Multi Thread:
- Code, data, files are common for all threads.
- Counter, stack are maintained separately at thread level
Process Vs Thread
- Data cannot be shared across processes
- Thread is a light weight process
Concurrency vs Parallelism
- Concurrent systems: more than one task at different stages of execution, but not necessarily making progress at the same instant
- Parallel systems: Multiple tasks making progress at the same time
- Parallel system is always concurrent but not vice versa
- How to run a task (e.g. print a statement) in a separate thread
- create a class for that task
- implement the Runnable interface OR extend the Thread class
- implement run() in that class
- create a Thread object, pass it an instance of the task class, and call start()
- Print 100 numbers but in each one in different thread
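The steps above can be sketched as follows, shown for 1..10 (the earlier exercise); changing the loop bound to 100 covers the other variant. The `AtomicInteger` counter is an addition for verifiability, not part of the exercise.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Prints 1..10, each number from its own thread, by implementing Runnable.
public class PrintNumbers implements Runnable {
    static final AtomicInteger printed = new AtomicInteger();
    private final int n;

    PrintNumbers(int n) { this.n = n; }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " -> " + n);
        printed.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[10];
        for (int i = 1; i <= 10; i++) {
            threads[i - 1] = new Thread(new PrintNumbers(i));
            threads[i - 1].start();   // output order is not guaranteed
        }
        for (Thread t : threads) t.join();  // wait for all threads to finish
    }
}
```

Note the numbers may print out of order: the scheduler decides when each thread runs.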
- Executor Frameworks
- Thread Pools
- Callables
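A minimal sketch of the three topics above together: a fixed thread pool (Executor framework) runs `Callable`s that return values, instead of creating one `Thread` per task by hand. The task (squaring numbers) is a made-up example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Executor-framework sketch: submit Callables to a thread pool and
// collect their results via Futures.
public class SquareTasks {
    static int sumOfSquares(int upTo) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // thread pool
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 1; i <= upTo; i++) {
            final int n = i;
            futures.add(pool.submit(() -> n * n)); // Callable<Integer>
        }
        int sum = 0;
        for (Future<Integer> f : futures) sum += f.get(); // blocks until done
        pool.shutdown();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum of squares 1..5 = " + sumOfSquares(5)); // 55
    }
}
```

Unlike raw `Thread`s, the pool reuses a fixed number of workers, which avoids the cost of creating a thread per task.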
Operating Systems - 4 Synchronization and Memory Management
- Intro to Synchronization problems
- Problem: two threads work on the same shared variable. One adds the numbers 1..100 and the other subtracts 1..100; without synchronization the final result is unpredictable
- when does it happen
- Critical section: the region of code where multiple processes/threads work on shared data
- Race condition: more than one thread entering the critical section at the same time
- preemption makes the race condition possible (non-preemptive execution would be sequential)
- Mutual Exclusion
- Only one thread can enter critical section at a time
- Progress
- Overall system should keep running and making progress
- Bounded waiting
- waiting should not be indefinite
- No busy waiting (a waiting thread should not burn CPU cycles repeatedly checking whether it can enter the critical section)
- The four points above are the properties of a good solution to a synchronization problem
- Mutex/Lock
- Take lock before entering critical section
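Applying this to the add/subtract problem from above: guarding the shared variable with a `ReentrantLock` makes the final value deterministically 0. A minimal sketch:

```java
import java.util.concurrent.locks.ReentrantLock;

// One thread adds 1..100, another subtracts 1..100. The lock makes each
// update to the shared value atomic, so the final result is always 0.
public class SharedCounter {
    private int value = 0;
    private final ReentrantLock lock = new ReentrantLock();

    void add(int n) {
        lock.lock();          // take the lock before the critical section
        try { value += n; } finally { lock.unlock(); }
    }

    void subtract(int n) {
        lock.lock();
        try { value -= n; } finally { lock.unlock(); }
    }

    int get() { return value; }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter c = new SharedCounter();
        Thread adder = new Thread(() -> { for (int i = 1; i <= 100; i++) c.add(i); });
        Thread subber = new Thread(() -> { for (int i = 1; i <= 100; i++) c.subtract(i); });
        adder.start(); subber.start();
        adder.join(); subber.join();
        System.out.println("final value = " + c.get()); // always 0 with locking
    }
}
```

Without the lock, `value += n` is a read-modify-write that two threads can interleave, losing updates.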
Computer Networks
- A network is a group of interconnected computers
- Advantages
- Sharing of resources
- communication
- Network interface adapter/card(chip)
- Every NIA/NIC has a MAC (Media Access Control) address
- ip address
- address on network
- ISP gives IP address to router
- Router give ip address to Other devices connected to router
- Tower gives the ip address to mobile internet
- MAC address is a unique id for an adapter (like an Aadhaar card for hardware)
- one mac for every adapter
- MAC is different for wifi adapter vs LAN adapter
- https://www.cleancss.com/mac-lookup/index.php
- Layered architecture (OSI model)
- application(software layer) -> gets input and forwards to next layer
- http/ftp/
- presentation(software layer) -> optimizations like encryption/encoding/compression
- session -> manages user session
- transport -> how to send it. If tcp ..checks for correctness and guarantees delivery
- tcp(transmission control protocol)
- udp(user datagram protocol)
- skype/meet/zoom
- network (IP) -> facilitates end-to-end communication
- decides the route
- data link -> manages communication between 2 directly connected hosts
- hop to hop connection
- laptop to router to router
- error detection and resend as needed
- physical layer
- TCP/IP model
- application/presentation/session are combined into a single application layer
- the physical layer is dropped, since TCP/IP concerns itself only with the software stack
- Application level protocols
- http
- DNS
- Transport layer protocol
- TCP
- UDP
- Sockets vs Ports
- client-server vs peer-peer architecture
- ephemeral ports
- Headers
- Socket: the object used to talk to the network // new Socket(ip, port);
- HTTP: a client-server protocol
- Uses TCP as transport layer protocol
- Stateless
- Types of http connections
- Non-persistent
- New TCP connection is established on every http request
- Persistent
- single TCP connection for all the requests
- DNS
- DNS is a hierarchical system
- DNS Root servers -> maintained by well established organizations/isps
- 13 such root servers exist(https://www.iana.org/domains/root/servers)
- TLD name servers(top level domain name server)(.com, .in, .co, .dev..)
- Authoritative name server
- At every level there is cache
- DNS Recursive resolver
- Asks the root server to resolve the name
- if the root server has the IP cached, it returns it, ELSE it returns the TLD server address
- if the TLD server has the IP, it returns it, ELSE it returns the authoritative name server address
- the authoritative name server returns the IP address
- traceroute google.com
- host -t ns oracle.com
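From Java, the whole resolver chain above is hidden behind `InetAddress.getByName()`, which asks the system resolver (with its caches). A small sketch; `google.com` is just an example host:

```java
import java.net.InetAddress;

// DNS lookup from Java: getByName() delegates to the system resolver,
// which walks root -> TLD -> authoritative servers as needed (with caching).
public class DnsLookup {
    static String resolve(String host) throws Exception {
        return InetAddress.getByName(host).getHostAddress();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("google.com -> " + resolve("google.com"));
    }
}
```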
- Headers: Information needed by protocol to perform their responsibilities
- TCP
- Transmission control protocol
- reliable
- checks for errors
- connection oriented (a connection is established first, and data is sent over that connection/path)
- larger header size
- UDP
- User datagram protocol
- it is best effort
- basic error checking
- connection less
- small header
- How TCP works
- 3-way handshake
- client sends a SYN message (with an initial sequence number, e.g. 1521)
- server replies with SYN-ACK
- client replies with ACK, and the connection is established
- Server
- Create a Socket
- Bind to a port
- starts listening to connect
- Server accepts a connection
- Receive and send data
- Client
- Create a socket(specify ip address family & connection type)
- connect to a server by specifying host and port
- Send data
- Receive data
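The server and client steps above can be sketched with Java sockets. For a self-contained example, both sides run in one program (the server on a background thread); the echo behavior is an illustrative choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// TCP sketch: server binds and listens; client connects, sends a line,
// and reads the server's echo back.
public class EchoDemo {
    static String roundTrip(String msg) throws IOException, InterruptedException {
        ServerSocket server = new ServerSocket(0); // create socket, bind any free port
        int port = server.getLocalPort();

        Thread serverThread = new Thread(() -> {
            try (Socket conn = server.accept();    // server accepts a connection
                 BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
                 PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                out.println(in.readLine());        // receive data, send it back
            } catch (IOException ignored) { }
        });
        serverThread.start();

        try (Socket client = new Socket("localhost", port); // client connects to host:port
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
            out.println(msg);                      // send data
            String reply = in.readLine();          // receive data
            serverThread.join();
            server.close();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello over TCP"));
    }
}
```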
- Mutex/lock: only one thread goes into critical section
- Semaphores
- Concurrent data structures
- Semaphore s = new Semaphore(5); // up to 5 threads can acquire the lock
- s.acquire();
- s.release();
- Homework
- https://leetcode.com/problems/traffic-light-controlled-intersection
- https://leetcode.com/problems/print-foobar-alternately
- https://leetcode.com/problems/print-zero-even-odd
- https://leetcode.com/problems/the-dining-philosophers
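As a taste of the homework, here is one possible shape of a solution to the print-foobar-alternately problem using two semaphores to enforce the order foo -> bar -> foo -> bar:

```java
import java.util.concurrent.Semaphore;

// Two semaphores hand the turn back and forth between the threads, so the
// output is always "foobar" repeated n times regardless of scheduling.
public class FooBar {
    private final int n;
    private final Semaphore fooTurn = new Semaphore(1); // foo goes first
    private final Semaphore barTurn = new Semaphore(0);
    final StringBuilder out = new StringBuilder();      // for demonstration

    FooBar(int n) { this.n = n; }

    void foo() throws InterruptedException {
        for (int i = 0; i < n; i++) {
            fooTurn.acquire();
            out.append("foo");
            barTurn.release();   // let bar run
        }
    }

    void bar() throws InterruptedException {
        for (int i = 0; i < n; i++) {
            barTurn.acquire();
            out.append("bar");
            fooTurn.release();   // let foo run
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FooBar fb = new FooBar(2);
        Thread t1 = new Thread(() -> { try { fb.foo(); } catch (InterruptedException e) { } });
        Thread t2 = new Thread(() -> { try { fb.bar(); } catch (InterruptedException e) { } });
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println(fb.out); // foobarfoobar
    }
}
```

Note the `StringBuilder` is safe here only because the semaphores fully serialize access to it.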
- Atomic data types
- AtomicInteger
- Concurrent Hashmap
- Divides keys into buckets
- lock is acquired on bucket instead of complete map
- If you have a conc env, consider using concurrent data structures instead of normal data structures
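A small sketch combining both ideas above: `AtomicInteger` for lock-free counters and `ConcurrentHashMap` for the shared map. The word-counting task is a made-up example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Two threads count the same words concurrently. ConcurrentHashMap locks
// per bucket rather than the whole map, and AtomicInteger makes each
// increment atomic, so no updates are lost.
public class WordCounter {
    static int countTotal(String[] words) throws InterruptedException {
        ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

        Runnable task = () -> {
            for (String w : words)
                counts.computeIfAbsent(w, k -> new AtomicInteger()).incrementAndGet();
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start(); t1.join(); t2.join();

        int total = 0;
        for (AtomicInteger c : counts.values()) total += c.get();
        return total;   // 2 threads x words.length increments
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countTotal(new String[]{"a", "b", "a"})); // 6
    }
}
```

With a plain `HashMap` and `int` counters, the same two threads could corrupt the map or lose increments.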
- Deadlocks(Concurrency)
- The biggest concurrency problem is synchronization, for which the solution is locking. However, locking can lead to deadlock
- Deadlock is a state where the progress of the system is stalled because processes are waiting for resources held by each other
- Solution for deadlocks
- All threads should take locks in defined order eg: ascending order
- Deadlock recovery algorithm
- How to detect deadlock
- If the directed graph of processes and resources (the resource-allocation graph) contains a cycle, a deadlock has occurred
- How to resolve
- kill or roll back one process at a time until the deadlock is broken
- Deadlock Ignorance
- Most real-world OSes simply ignore deadlocks; the user restarts the system
- Timeout (e.g. 3-4 seconds): kill the process if it times out
- if tryLock fails, either sleep and retry, or release all previously acquired locks and ask the user to retry
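The lock-ordering solution above can be sketched as follows: every thread acquires the locks in the same (ascending) order, so a circular wait can never form.

```java
import java.util.concurrent.locks.ReentrantLock;

// Deadlock avoidance by lock ordering: both threads take lockA before
// lockB, never the reverse, so neither can hold one lock while waiting
// for a lock the other holds.
public class OrderedLocks {
    static final ReentrantLock lockA = new ReentrantLock(); // "lower" lock
    static final ReentrantLock lockB = new ReentrantLock(); // "higher" lock
    static int transfers = 0;

    static void transfer() {
        lockA.lock();                       // always acquired first
        try {
            lockB.lock();                   // always acquired second
            try { transfers++; } finally { lockB.unlock(); }
        } finally { lockA.unlock(); }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> { for (int i = 0; i < 1000; i++) transfer(); };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println("transfers = " + transfers); // 2000, no deadlock
    }
}
```

If one thread instead took lockB before lockA, the two could each grab one lock and wait forever for the other.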
- Memory Management
- CPU connects/interacts with registers, caches, memory(RAM)
- rough orders of magnitude: registers hold a few bytes each, caches are tens of MB (e.g. ~20 MB), memory is in GB (e.g. 10 GB), storage/disk is in TB
- Applications/programs reside in disk
- Process resides in memory
- Out of memory - when RAM is not enough to load process/program
- How are processes stored in RAM
- Each process is stored contiguously, occupying its own part of memory
- Internal Fragmentation: Non utilized space from what has been allocated to a process
- External Fragmentation: Non utilized space between 2 processes
- Paging
- The system maintains a page table per process, mapping the process's pages to where the data is stored
- The page table works with the memory management unit (MMU), which knows the actual location of the data
- The fixed-size blocks of a process's logical address space are called pages; the matching fixed-size blocks of physical memory (RAM) are called frames
- Two types of addresses
- Logical & Physical addresses
- The page table maps logical addresses to physical ones; the physical data can be in memory or swapped out to storage
- The CPU issues a logical address, and the MMU translates it to the real (physical) address
- If the data is on disk, the OS loads it into RAM and the MMU returns the resulting address to the CPU
- Page Replacement algorithms
- decides which page to evict from memory when a new page must be loaded from disk
- FIFO, LIFO, LRU
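LRU replacement can be sketched in a few lines using Java's `LinkedHashMap` in access order, which evicts the least-recently-used page once the frame count is exceeded. The reference string and frame count are made-up values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU page replacement sketch: an access-ordered LinkedHashMap drops the
// least-recently-used entry when more pages than frames are resident.
public class LruPages {
    static int countFaults(int[] refs, int frames) {
        Map<Integer, Boolean> memory =
            new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) { // true = access order
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    return size() > frames;      // evict the LRU page
                }
            };
        int faults = 0;
        for (int page : refs) {
            if (!memory.containsKey(page)) faults++; // page fault: load from disk
            memory.put(page, true);                  // touch = most recently used
        }
        return faults;
    }

    public static void main(String[] args) {
        int[] refs = {1, 2, 3, 1, 4, 5};
        System.out.println(countFaults(refs, 3) + " faults"); // 5 faults
    }
}
```

Tracing it: pages 1, 2, 3 fault on first use; the second reference to 1 hits; loading 4 evicts 2 (least recently used), and loading 5 evicts 3, giving 5 faults.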
- Cons of Paging
- relative slowness
- Pros of paging
- MMU can make more efficient use of resources
- More applications can run than RAM alone could hold
- Security: applications never see the real physical addresses
- Page Fault
- The application requests data at a logical address. The MMU checks whether that page is present in a RAM frame; if not, a page fault occurs and the OS must fetch the page from storage, which adds significant overhead
- Thrashing
- Excessive page faults
- It is bad as CPU is wasting time getting data from disk rather than doing work
- Does increasing the number of frames always reduce page faults?
- Not necessarily; with FIFO replacement it can even increase them
- This is called Belady's anomaly
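Belady's anomaly can be demonstrated directly: with FIFO replacement, the classic reference string 1,2,3,4,1,2,5,1,2,3,4,5 produces 9 faults with 3 frames but 10 faults with 4 frames.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// FIFO page replacement: evict the page that has been resident longest.
// Counting faults on the classic string shows Belady's anomaly.
public class BeladyDemo {
    static int fifoFaults(int[] refs, int frames) {
        Queue<Integer> fifo = new ArrayDeque<>();  // eviction order
        Set<Integer> inMemory = new HashSet<>();
        int faults = 0;
        for (int page : refs) {
            if (inMemory.contains(page)) continue; // hit, no fault
            faults++;                              // page fault
            if (fifo.size() == frames) inMemory.remove(fifo.poll()); // evict oldest
            fifo.add(page);
            inMemory.add(page);
        }
        return faults;
    }

    public static void main(String[] args) {
        int[] refs = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
        System.out.println("3 frames: " + fifoFaults(refs, 3) + " faults"); // 9
        System.out.println("4 frames: " + fifoFaults(refs, 4) + " faults"); // 10
    }
}
```

LRU does not suffer from this anomaly: adding frames with LRU never increases the fault count.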