CS346 Lecture Notes
File and Buffer Management Review,
Overview of PF and RM components


Paged files

Data on disk maps to pages of files:

[Figure: data on disk mapped to pages of files]

  • Page (block) is unit of movement to/from disk
  • Pages of file usually contiguous on disk

  • RedBase page size = 4096 - 4 = 4092 bytes (4K-8K is typical)

    Buffer management

    Pages of disk files move in/out of in-memory buffer pool:
    
    [Figure: pages moving between disk files and the in-memory buffer pool]
    
  • RedBase #pages in buffer = 40
  • RedBase total buffer size = 40 x 4096 bytes, about 0.16 megabytes (i.e., tiny!)
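The sizes quoted above follow from simple arithmetic. A minimal sketch, with constant names invented here for illustration (the PF document is the authority on the real names):

```cpp
#include <cstddef>

// Illustrative constants; the names are assumptions, the values come from the notes.
constexpr std::size_t kRawPageBytes    = 4096;  // page as moved to/from disk
constexpr std::size_t kPageOverhead    = 4;     // bytes the notes subtract per page
constexpr std::size_t kUsablePageBytes = kRawPageBytes - kPageOverhead;
constexpr std::size_t kBufferPages     = 40;    // page slots in the buffer pool

// Memory held by the pool, counting full 4096-byte pages.
constexpr std::size_t kBufferBytes = kBufferPages * kRawPageBytes;

static_assert(kUsablePageBytes == 4092, "usable bytes per page");
static_assert(kBufferBytes == 163840, "about 0.16 megabytes");
```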

    Buffer function

    Page pinning

    If a page is pinned (fixed) in buffer:

    Question: Why pin pages?
    (1)
    
    
    (2)
    
    
    (3)
    

    Buffer manager - basic functions

    1. Search for page in buffer (file name + page number)
      • sequential
      • table/list
      • hashing -- RedBase

    2. Replace some page in buffer
      Strategies:
        (a)
      
        (b)
      
        (c)
      
        (d)
      
        (e)
      
      

    3. Reserve (allocate) set of page slots
      • for certain operations
      • for certain types of pages (header, index, metadata)
      • for certain transactions
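The search and replace functions above can be sketched together. This is a hypothetical miniature pool, not the RedBase implementation: it finds pages by hashing on (file name, page number), and it evicts the least recently used unpinned page. LRU is picked here only as one common policy for illustration. The class name `BufferPool` and its methods are assumptions.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical miniature buffer pool (names and policy are assumptions).
class BufferPool {
public:
    explicit BufferPool(int numSlots) : slots_(numSlots) {
        for (int i = 0; i < numSlots; ++i) free_.push_back(i);
    }

    // Pins the page and returns its slot, or -1 if every slot is pinned.
    int Fetch(const std::string& file, int pageNum) {
        const std::string key = MakeKey(file, pageNum);
        int slot;
        auto it = table_.find(key);
        if (it != table_.end()) {      // search: hash lookup hit
            slot = it->second;
            lru_.remove(slot);         // no longer a replacement candidate
        } else {
            slot = TakeSlot();         // replace: free slot or eviction
            if (slot < 0) return -1;
            // A real buffer manager would read the page from disk here.
            slots_[slot].key = key;
            slots_[slot].pinCount = 0;
            table_[key] = slot;
        }
        ++slots_[slot].pinCount;
        return slot;
    }

    // Drops one pin; a fully unpinned page becomes a replacement candidate.
    void Unpin(int slot) {
        if (slot < 0 || static_cast<std::size_t>(slot) >= slots_.size()) return;
        if (slots_[slot].pinCount > 0 && --slots_[slot].pinCount == 0)
            lru_.push_back(slot);      // most recently used goes to the back
    }

    bool InBuffer(const std::string& file, int pageNum) const {
        return table_.count(MakeKey(file, pageNum)) > 0;
    }

private:
    struct Slot {
        std::string key;
        int pinCount = 0;
    };

    static std::string MakeKey(const std::string& file, int pageNum) {
        return file + "#" + std::to_string(pageNum);
    }

    // Prefers a free slot; otherwise evicts the least recently used unpinned page.
    int TakeSlot() {
        if (!free_.empty()) {
            int s = free_.back();
            free_.pop_back();
            return s;
        }
        if (lru_.empty()) return -1;   // everything is pinned
        int victim = lru_.front();
        lru_.pop_front();
        // A real buffer manager would write the victim back first if it were dirty.
        table_.erase(slots_[victim].key);
        return victim;
    }

    std::vector<Slot> slots_;
    std::vector<int> free_;
    std::list<int> lru_;                           // unpinned slots, oldest first
    std::unordered_map<std::string, int> table_;   // page key -> slot
};
```

Note that a pinned page never appears in the candidate list, so `Fetch` fails rather than evicting a page that some caller still holds.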

    Difference between DBMS buffer manager and OS virtual memory

    In DBMS:

    => Interaction between buffer and virtual memory can be troublesome.

    Relations and files

    Easiest: one relation per file

    Question: Why do something different?

    
    
    
    

    Records (tuples) on paged files

    Suppose records are fixed-length, no larger than a page, and unsorted: (RedBase)

    Simple record layout scheme:

    [Figure: simple record layout on a page]
    
  • Insert record: in first available slot
  • Delete record: update free-space information
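The insert and delete rules above can be sketched for fixed-length records. The layout below, one "in use" flag per slot followed by the record bytes, is an assumption made for illustration; the class `RecordPage` is hypothetical and models a single page in memory.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory model of one page of fixed-length records.
class RecordPage {
public:
    RecordPage(std::size_t recordSize, std::size_t numSlots)
        : recordSize_(recordSize),
          used_(numSlots, false),
          data_(recordSize * numSlots) {}

    // Copies recordSize bytes into the first free slot.
    // Returns the slot number, or -1 if the page is full.
    int Insert(const char* rec) {
        for (std::size_t s = 0; s < used_.size(); ++s) {
            if (!used_[s]) {
                std::memcpy(&data_[s * recordSize_], rec, recordSize_);
                used_[s] = true;
                return static_cast<int>(s);
            }
        }
        return -1;
    }

    // Deleting only updates the free-space information; the bytes stay in place.
    bool Delete(int slot) {
        if (!InUse(slot)) return false;
        used_[slot] = false;
        return true;
    }

    // Returns the record bytes, or nullptr if the slot is empty or invalid.
    const char* Get(int slot) const {
        return InUse(slot) ? &data_[slot * recordSize_] : nullptr;
    }

private:
    bool InUse(int slot) const {
        return slot >= 0 && static_cast<std::size_t>(slot) < used_.size() &&
               used_[slot];
    }

    std::size_t recordSize_;
    std::vector<bool> used_;   // free-space information, one flag per slot
    std::vector<char> data_;   // slot s occupies bytes [s*recordSize, (s+1)*recordSize)
};
```

Because records never move, a slot number stays valid for the life of the record, which is what makes a (page, slot) pair usable as a record identifier.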

    More complex for:

  • variable-length records
  • large records (bigger than a page)
  • sorted records

    Summary

  • files -- one per relation
  • pages -- buffer granularity; store fixed-length records



    RedBase Paged File (PF) component

    Read PF document carefully for detailed specs!

    General picture:

    [Figure: general picture of the PF component]
    
    1. Startup: create one instance of PF_Manager class

    2. File and scratch space routines:
           PF_Manager:: CreateFile
                        DestroyFile
                        OpenFile    -> initialized "file handle"
                        CloseFile
                        AllocateBlock  -> scratch memory page in buffer
                        DisposeBlock
      

    3. Page routines:
           PF_FileHandle:: GetFirstPage   -> initialized "page handle"
                           GetLastPage    ->           "
                           GetNextPage(n) ->           "
                           GetPrevPage(n) ->           "
                           GetThisPage(n) ->           "
                           AllocatePage   ->           "
                        // All page fetches pin page automatically
                           DisposePage(n)
                           MarkDirty(n)
                           UnpinPage(n)
                           ForcePages(n)
      
    4. Page Access:
           PF_PageHandle:: GetData
                           GetPageNum
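Since every page fetch pins the page, each fetch needs a matching unpin, and a modified page must be marked dirty before it is released. The sketch below shows that calling pattern against a hypothetical in-memory stand-in (`MiniFileHandle`, `WriteToPage`). It borrows method names from the list above but is not the PF interface; the real signatures and return codes are in the PF document.

```cpp
#include <cstddef>
#include <cstring>
#include <deque>

const std::size_t kPageBytes = 4092;  // usable bytes per page, from the notes

// Hypothetical stand-in for a file handle. It keeps pages in memory and
// tracks only pin counts and dirty flags.
class MiniFileHandle {
public:
    // Adds a zeroed page and leaves it pinned, as a page fetch would.
    int AllocatePage() {
        pages_.push_back(Page());
        pages_.back().pinCount = 1;
        return static_cast<int>(pages_.size()) - 1;
    }
    // Pins page n and returns its bytes, or nullptr if n is out of range.
    char* GetThisPage(int n) {
        if (!Valid(n)) return nullptr;
        ++pages_[n].pinCount;
        return pages_[n].data;
    }
    // In this sketch only a pinned page may be marked dirty.
    bool MarkDirty(int n) {
        if (!Valid(n) || pages_[n].pinCount == 0) return false;
        pages_[n].dirty = true;
        return true;
    }
    // Releases one pin; fails if the page is not currently pinned.
    bool UnpinPage(int n) {
        if (!Valid(n) || pages_[n].pinCount == 0) return false;
        --pages_[n].pinCount;
        return true;
    }
    int PinCount(int n) const { return Valid(n) ? pages_[n].pinCount : -1; }
    bool IsDirty(int n) const { return Valid(n) && pages_[n].dirty; }

private:
    struct Page {
        char data[kPageBytes];
        int pinCount;
        bool dirty;
        Page() : pinCount(0), dirty(false) { std::memset(data, 0, sizeof data); }
    };
    bool Valid(int n) const {
        return n >= 0 && static_cast<std::size_t>(n) < pages_.size();
    }
    std::deque<Page> pages_;  // deque keeps page addresses stable as pages are added
};

// Fetch (pins), modify, mark dirty, unpin: the pattern every page access follows.
bool WriteToPage(MiniFileHandle& fh, int n, const char* text) {
    char* data = fh.GetThisPage(n);
    if (!data) return false;
    std::strncpy(data, text, kPageBytes - 1);  // last byte stays zero
    bool marked = fh.MarkDirty(n);
    return fh.UnpinPage(n) && marked;
}
```

Forgetting the unpin leaves the pin count above zero, and with only 40 buffer slots a few leaked pins are enough to exhaust the pool.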
      
    Notes:

    Read document carefully!!
    (return codes, error handling, statistics tracking, ...)

    Also read the RedBase Logistics document carefully
    (getting code, makefiles, ...)



    RedBase Record Management (RM) component

    General picture:

    [Figure: general picture of the RM component]
    
    File and page structure:
    
    [Figure: RM file and page structure]
    
    Managing free space

    Record identifier (RID) = (page number, slot number)
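A RID can be turned into a byte position once a page layout is chosen. The sketch below assumes one possible layout, a free-slot bitmap at the start of the page followed by fixed-length records. The names `Rid`, `MaxSlots`, and `SlotOffset` are hypothetical; the RM document defines the real RID class and leaves the page layout to the designer.

```cpp
#include <cstddef>

// Hypothetical RID value type: (page number, slot number).
class Rid {
public:
    Rid(int pageNum, int slotNum) : pageNum_(pageNum), slotNum_(slotNum) {}
    int GetPageNum() const { return pageNum_; }
    int GetSlotNum() const { return slotNum_; }
private:
    int pageNum_;
    int slotNum_;
};

// Largest n such that a bitmap of n bits plus n records fits in pageBytes.
std::size_t MaxSlots(std::size_t pageBytes, std::size_t recordSize) {
    std::size_t n = (8 * pageBytes) / (8 * recordSize + 1);
    // The estimate ignores rounding the bitmap up to whole bytes; correct for it.
    while (n > 0 && (n + 7) / 8 + n * recordSize > pageBytes) --n;
    return n;
}

// Byte offset of the record named by rid within its page,
// given the size of the bitmap that precedes the records.
std::size_t SlotOffset(const Rid& rid, std::size_t recordSize,
                       std::size_t bitmapBytes) {
    return bitmapBytes + static_cast<std::size_t>(rid.GetSlotNum()) * recordSize;
}
```

With 4092 usable bytes and 100-byte records, this layout holds 40 records per page: 5 bitmap bytes plus 4000 record bytes.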

    1. Startup: RM_Manager class -- one instance, PF_Manager is parameter

    2. File routines:
           RM_Manager:: CreateFile(record size) - set up header
                        DestroyFile
                        OpenFile -> initialized file handle
                                    (copy header)
                        CloseFile
      
    3. Record routines:
           RM_FileHandle:: GetRec(RID)
                           InsertRec(data)
                           DeleteRec(RID)
                           UpdateRec(data,RID)
           Also ForcePages(n)
      
      [Diagram of record handling]
      
    4. Record Access:
           RM_Record:: GetData
                       GetRid
      
    5. Record IDs:
           RID:: GetPageNum
                 GetSlotNum
      
    6. Scans:
           RM_FileScan:: OpenScan(file, attr, comp, value, pin-hint)
                      // File scan object is "scan handle", maintains state of scan
                         GetNextRec
                         CloseScan
      
      • Can suspend and resume scans at will.
      • Don't worry about modification scans concurrent with other scans.
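The scan interface above can be sketched as an object that remembers its position and filters on one attribute. The example is hypothetical and much simplified: it scans an in-memory array of fixed-length records, handles only integer attributes, and omits the pin hint. The names `IntScan` and `CompOp` (and its enumerators) are assumptions, not the ones in the RM header files.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

enum class CompOp { EQ, NE, LT, LE, GT, GE, NONE };  // hypothetical names

// Hypothetical scan over fixed-length records held in one byte array.
// The scan object is the "scan handle": it keeps the current position.
// The caller must keep `data` alive for as long as the scan is used.
class IntScan {
public:
    IntScan(const std::vector<char>& data, std::size_t recordSize,
            std::size_t attrOffset, CompOp op, int value)
        : data_(data), recordSize_(recordSize), attrOffset_(attrOffset),
          op_(op), value_(value), next_(0) {}

    // Returns the next record whose attribute satisfies the comparison,
    // or nullptr when the scan is exhausted.
    const char* GetNextRec() {
        while (recordSize_ > 0 && (next_ + 1) * recordSize_ <= data_.size()) {
            const char* rec = &data_[next_ * recordSize_];
            ++next_;  // position is saved, so the scan can be resumed later
            int attr;
            std::memcpy(&attr, rec + attrOffset_, sizeof attr);  // safe unaligned read
            if (Matches(attr)) return rec;
        }
        return nullptr;
    }

private:
    bool Matches(int attr) const {
        switch (op_) {
            case CompOp::EQ: return attr == value_;
            case CompOp::NE: return attr != value_;
            case CompOp::LT: return attr <  value_;
            case CompOp::LE: return attr <= value_;
            case CompOp::GT: return attr >  value_;
            case CompOp::GE: return attr >= value_;
            case CompOp::NONE: return true;  // no condition: every record matches
        }
        return true;
    }

    const std::vector<char>& data_;
    std::size_t recordSize_;
    std::size_t attrOffset_;
    CompOp op_;
    int value_;
    std::size_t next_;  // index of the next record to examine
};
```

Because all scan state lives in the object, a caller can stop calling `GetNextRec`, do other work, and pick the scan up again where it left off.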

    Read the RM document carefully!
    (detailed specs, design suggestions, header files, return codes, error handling, documentation, ...)

    Also read the RedBase Logistics document carefully
    (testing, submission, grading)