/joh'liks/ n.,adj. 386BSD

Porting Unix to the 386: A Practical Approach



William & Lynne Jolitz


We initialize the processor with an initial set of descriptor and page tables; the processor must be running on these tables before the kernel's memory and interrupt functions can be activated.




Processor Support -- i386.c
Within the i386.c module appear the code and data structures needed to "wire-down" most of the 386's processor structures (descriptors, exceptions, task switch state). init386() is a subroutine that "fills in the blanks," and test386() tests portions of the mechanisms we will need to run our BSD UNIX system. Note that this creates a superficial test bed that does not entirely address our intended system: user and kernel mode not only share an address space, they are the same program!

Listing 9:
/* [excerpted from i386.c] */
 ...
#define   NBPG      4096         /* number of bytes per page */
#define   PG_V      0x00000001   /* mark this page as valid */
#define   PG_UW     0x00000006   /* user and supervisor writable */

int lcr0(), lcr3();
 ...
init386() {
   /* bag of bytes to put page table, page directory in */
   static char bag[(1+1+1)*NBPG];
   int *ppte, *pptd, *cr3, x;

   /* make page table & directory aligned to NBPG */
   ppte = (int *) (((int) bag + NBPG-1) & ~(NBPG-1));
   cr3 = pptd = ppte + 1024;

   /* page table directory only has lowest 4MB entry mapped */
   *pptd++ = (int) ppte + (PG_V|PG_UW);
   for (x = 1; x < 1024 ; x++,pptd++) *pptd =  0;

   /* page table, all entries virtual == real, user/supervisor r/w */
   for (x = 0; x < 1024 ; x++,ppte++) *ppte =  x*NBPG + (PG_V|PG_UW) ;

   /* turn on paging */
   lcr3(cr3);
   printf("paging"); getchar();
   lcr0(0x80000001);
 ...

/* [excerpted from srt.s] */


   /* lcr3(cr3) */
   .globl   _lcr3
_lcr3:
   movl   4(%esp),%eax
   movl   %eax,%cr3
   ret

   /* lcr0(cr0) */
   .globl   _lcr0
_lcr0:
   movl   4(%esp),%eax
   movl   %eax,%cr0
   ret

We start by initializing paging (Listing Nine). The fragment contains code that enables paging by building a page table and a page directory. For this example, we map virtual addresses to correspond with physical addresses identically, and allow the first 4 Mbytes of physical memory to be referenced "read/write" by both user and supervisor (kernel) rings. It is important to remember that while the processor's instructions work through the paging MMU with virtual addresses, the addresses that the MMU uses to consult the page directory and page tables are all physical addresses. These physical addresses do not always correspond to the virtual addresses that the processor uses, unlike this example where virtual addresses are mapped one for one. As a result, when modifying the page tables and page directory, the kernel must explicitly convert any virtual addresses used to physical.
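To make the translation concrete, the sketch below models in ordinary C how a 32-bit address splits into a directory index, a table index, and a page offset, and how the identity map of Listing Nine sends each address in the first 4 Mbytes back to itself. The helper names (pd_index, pt_index, pg_offset, identity_map, translate) are hypothetical and not part of i386.c; the model runs as a user program and touches no hardware.

```c
#include <assert.h>
#include <stdint.h>

#define NBPG   4096u         /* bytes per page, as in Listing Nine */
#define PG_V   0x00000001u   /* page valid */
#define PG_UW  0x00000006u   /* user and supervisor writable */
#define NPTE   1024u         /* entries per page table or directory */

static_assert(NPTE * sizeof(uint32_t) == NBPG,
              "a page table fills exactly one page");

/* Hypothetical helpers: split a 32-bit address into its three fields. */
uint32_t pd_index(uint32_t va)  { return va >> 22; }
uint32_t pt_index(uint32_t va)  { return (va >> 12) & 0x3ffu; }
uint32_t pg_offset(uint32_t va) { return va & (NBPG - 1); }

/* Fill one page table so that virtual == physical for the first 4 MB. */
void identity_map(uint32_t pt[NPTE])
{
    for (uint32_t x = 0; x < NPTE; x++)
        pt[x] = x * NBPG + (PG_V | PG_UW);
}

/*
 * Translate va through a single page table, roughly as the MMU would.
 * Only directory slot 0 is populated in this model, so any address at
 * or above 4 MB is reported as unmapped (0xffffffff).
 */
uint32_t translate(const uint32_t pt[NPTE], uint32_t va)
{
    if (pd_index(va) != 0)
        return 0xffffffffu;
    uint32_t pte = pt[pt_index(va)];
    if (!(pte & PG_V))
        return 0xffffffffu;
    return (pte & ~(NBPG - 1)) | pg_offset(va);
}
```

Calling identity_map() on a 1024-entry array and then translate() on an address such as 0x00123456 returns the same address, which is the "null" mapping the listing sets up.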

Another point to mention about this paging mechanism is that the page tables and page directory themselves need to be mapped to a given virtual address so that the kernel may modify them to change address translation on demand. An oddity of this paging mechanism is that it can work even if the page tables are completely inaccessible to the kernel in its virtual address space. This would be inconvenient for the kernel, however, as it spends a great deal of time modifying these structures already.

Two assembly language helper routines, lcr0() and lcr3(), allow us to set the 386's processor control register (CR0) and page directory base register (CR3), respectively. Since we are already running "protected," lcr0() simply rewrites the already-set protect-mode bit and sets the paging-mode bit, allowing the MMU to enter paging mode.
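The constant 0x80000001 passed to lcr0() in Listing Nine is just the two mode bits combined. A minimal sketch, assuming the names CR0_PE and CR0_PG and two hypothetical predicates (the listing itself uses only the literal value):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; Listing Nine passes the literal 0x80000001. */
#define CR0_PE 0x00000001u   /* bit 0: protected mode enable */
#define CR0_PG 0x80000000u   /* bit 31: paging enable */

static_assert((CR0_PE | CR0_PG) == 0x80000001u,
              "protect-mode bit plus paging bit gives the value used by lcr0()");

/* Hypothetical predicates for inspecting a CR0 value. */
int cr0_is_protected(uint32_t cr0) { return (cr0 & CR0_PE) != 0; }
int cr0_is_paged(uint32_t cr0)     { return (cr0 & CR0_PG) != 0; }
```

Writing the value as CR0_PE | CR0_PG rather than as a bare constant makes it plain that the protect-mode bit is being rewritten unchanged while the paging bit is newly set.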

Our page tables and directory as encoded here provide a null address mapping, so that there is, as yet, no effective difference in address translation. One might wonder why we must do this. If we don't, several subtle problems arise. For example, if the address mapping of the instructions we are executing were to differ, the 386's view of which instruction was to be executed next might no longer match the next assembled instruction the program should have executed. Both must be changed synchronously. Worse, if the 386 has an instruction queue fetching asynchronously, we may not be able to predict exactly when the transition occurred. The safest way to avoid these problems is to enable page mapping with no net translation, then modify the address mapping after the processor is running on the "identity" map. We can then arrange to flush our various processor instruction queues and MMU address translation buffers before allowing the processor to execute instructions in a "translated" portion of the address space.

Listing 8:
/* [excerpted from i386.c] */
 ...
/* Assemble a gate descriptor  */
setgate(gp, func, typ, dpl) char *func; struct gate_descriptor *gp; {
   gp->gd_looffset = (int)func;
   gp->gd_selector = GSEL(GCODE_SEL,SEL_KPL);
   gp->gd_stkcpy = 0;
   gp->gd_xx = 0;
   gp->gd_type = typ;
   gp->gd_dpl = dpl;
   gp->gd_p = 1;      /* definitely present */
   gp->gd_hioffset = ((int)func)>>16 ;
}

/* ASM entry points to exception/trap/interrupt entry stub code. */
#define   IDTVEC(name)   X##name
extern
   IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl),
   IDTVEC(bnd), IDTVEC(ill), IDTVEC(dna), IDTVEC(dble), IDTVEC(fpusegm),
   IDTVEC(tss), IDTVEC(missing), IDTVEC(stk), IDTVEC(prot), IDTVEC(page),
   IDTVEC(rsvd), IDTVEC(fpu), IDTVEC(rsvd0), IDTVEC(rsvd1), IDTVEC(rsvd2),
   IDTVEC(rsvd3), IDTVEC(rsvd4), IDTVEC(rsvd5), IDTVEC(rsvd6),
   IDTVEC(rsvd7), IDTVEC(rsvd8), IDTVEC(rsvd9), IDTVEC(rsvd10),
   IDTVEC(rsvd11), IDTVEC(rsvd12), IDTVEC(rsvd13), IDTVEC(rsvd14),
   IDTVEC(intr0), IDTVEC(intr1), IDTVEC(intr2),
   IDTVEC(intr3), IDTVEC(intr4), IDTVEC(intr5), IDTVEC(intr6),
   IDTVEC(intr7), IDTVEC(intr8), IDTVEC(intr9), IDTVEC(intr10),
   IDTVEC(intr11), IDTVEC(intr12), IDTVEC(intr13), IDTVEC(intr14),
   IDTVEC(intr15), IDTVEC(syscall);
init386() {
 ...
   /* exceptions */
   setgate(idt+0, &IDTVEC(div),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+1, &IDTVEC(dbg),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+2, &IDTVEC(nmi),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+3, &IDTVEC(bpt),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+4, &IDTVEC(ofl),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+5, &IDTVEC(bnd),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+6, &IDTVEC(ill),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+7, &IDTVEC(dna),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+8, &IDTVEC(dble),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+9, &IDTVEC(fpusegm),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+10, &IDTVEC(tss),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+11, &IDTVEC(missing),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+12, &IDTVEC(stk),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+13, &IDTVEC(prot),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+14, &IDTVEC(page),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+15, &IDTVEC(rsvd),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+16, &IDTVEC(fpu),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+17, &IDTVEC(rsvd0),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+18, &IDTVEC(rsvd1),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+19, &IDTVEC(rsvd2),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+20, &IDTVEC(rsvd3),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+21, &IDTVEC(rsvd4),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+22, &IDTVEC(rsvd5),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+23, &IDTVEC(rsvd6),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+24, &IDTVEC(rsvd7),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+25, &IDTVEC(rsvd8),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+26, &IDTVEC(rsvd9),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+27, &IDTVEC(rsvd10),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+28, &IDTVEC(rsvd11),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+29, &IDTVEC(rsvd12),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+30, &IDTVEC(rsvd13),  SDT_SYS386TGT, SEL_KPL);
   setgate(idt+31, &IDTVEC(rsvd14),  SDT_SYS386TGT, SEL_KPL);

   /* first icu */
   setgate(idt+32, &IDTVEC(intr0),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+33, &IDTVEC(intr1),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+34, &IDTVEC(intr2),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+35, &IDTVEC(intr3),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+36, &IDTVEC(intr4),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+37, &IDTVEC(intr5),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+38, &IDTVEC(intr6),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+39, &IDTVEC(intr7),  SDT_SYS386IGT, SEL_KPL);

   /* second icu */
   setgate(idt+40, &IDTVEC(intr8),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+41, &IDTVEC(intr9),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+42, &IDTVEC(intr10),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+43, &IDTVEC(intr11),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+44, &IDTVEC(intr12),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+45, &IDTVEC(intr13),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+46, &IDTVEC(intr14),  SDT_SYS386IGT, SEL_KPL);
   setgate(idt+47, &IDTVEC(intr15),  SDT_SYS386IGT, SEL_KPL);

   printf("lidt\n"); getchar();
   lidt(idt, sizeof(idt)-1);
 ...
 /* [excerpted from srt.s] */

   /* lidt(*idt, nidt) */
   .globl   _lidt
idesc:   .word   0
   .long   0
_lidt:
   movl   4(%esp),%eax
   movl   %eax,idesc+2
   movl   8(%esp),%eax
   movw   %ax,idesc
   lidt   idesc
   ret

Besides paging, we must reinitialize segmentation. We start by "flattening" the 386 with our descriptor tables. On the 386 (see Listing Six), our Global Descriptor Table (GDT) describes address space selectors that will have global visibility within our BSD kernel such that all processes will see them. Kernel address space requires descriptors for instructions and data, as well as a task gate used to switch processes through, and various task state descriptors used to save and restore state on demand. The kernel has a "panic" task state reserved to be used when catching certain exceptions that require a "known good" task state.

For the address space selectors used in user processes, we have the Local Descriptor Table (LDT). We can use, potentially, one per process. These descriptors, as the name suggests, are private to each process, and describe the memory segments of that process. In addition, we have "gates." We need to use only one to call the system.

Listing 7:
/* segments.h: Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 6/20/1989
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * 386 Segmentation Data Structures and definitions
 */

/* Selectors  */
#define   ISPL(s)   ((s)&3)/* what is the priority level of a selector */
#define    SEL_KPL   0         /* kernel priority level */
#define    SEL_UPL   3         /* user priority level */
#define   ISLDT(s) ((s)&SEL_LDT)            /* is it local or global */
#define    SEL_LDT   4         /* local descriptor table */
#define   IDXSEL(s) (((s)>>3) & 0x1fff)     /* index of selector */
#define   LSEL(s,r) (((s)<<3) | SEL_LDT | (r))/* a local selector */
#define   GSEL(s,r) (((s)<<3) | (r))        /* a global selector */

/* Memory and System segment descriptors  */
struct   segment_descriptor   {
   unsigned sd_lolimit:16 ;    /* segment extent (lsb) */
   unsigned sd_lobase:24 ;     /* segment base address (lsb) */
   unsigned sd_type:5 ;        /* segment type */
   unsigned sd_dpl:2 ;         /* segment descriptor priority level */
   unsigned sd_p:1 ;           /* segment descriptor present */
   unsigned sd_hilimit:4 ;     /* segment extent (msb) */
   unsigned sd_xx:2 ;          /* unused */
   unsigned sd_def32:1 ;       /* default 32 vs 16 bit size */
   unsigned sd_gran:1 ;        /* limit granularity (byte/page units)*/
   unsigned sd_hibase:8 ;      /* segment base address  (msb) */
} ;

/* Gate descriptors (e.g. indirect descriptors)  */
struct   gate_descriptor   {
   unsigned gd_looffset:16 ;   /* gate offset (lsb) */
   unsigned gd_selector:16 ;   /* gate segment selector */
   unsigned gd_stkcpy:5 ;      /* number of stack wds to cpy */
   unsigned gd_xx:3 ;          /* unused */
   unsigned gd_type:5 ;        /* segment type */
   unsigned gd_dpl:2 ;         /* segment descriptor priority level */
   unsigned gd_p:1 ;           /* segment descriptor present */
   unsigned gd_hioffset:16 ;   /* gate offset (msb) */
} ;

/* Generic descriptor  */
union   descriptor   {
   struct   segment_descriptor sd;
   struct   gate_descriptor gd;
};
#define   d_type   gd.gd_type

   /* system segments and gate types */
#define   SDT_SYSNULL      0   /* system null */
#define   SDT_SYS286TSS    1   /* system 286 TSS available */
#define   SDT_SYSLDT       2   /* system local descriptor table */
#define   SDT_SYS286BSY    3   /* system 286 TSS busy */
#define   SDT_SYS286CGT    4   /* system 286 call gate */
#define   SDT_SYSTASKGT    5   /* system task gate */
#define   SDT_SYS286IGT    6   /* system 286 interrupt gate */
#define   SDT_SYS286TGT    7   /* system 286 trap gate */
#define   SDT_SYSNULL2     8   /* system null again */
#define   SDT_SYS386TSS    9   /* system 386 TSS available */
#define   SDT_SYSNULL3    10   /* system null again */
#define   SDT_SYS386BSY   11   /* system 386 TSS busy */
#define   SDT_SYS386CGT   12   /* system 386 call gate */
#define   SDT_SYSNULL4    13   /* system null again */
#define   SDT_SYS386IGT   14   /* system 386 interrupt gate */
#define   SDT_SYS386TGT   15   /* system 386 trap gate */

   /* memory segment types */
#define   SDT_MEMRO       16   /* memory read only */
#define   SDT_MEMROA      17   /* memory read only accessed */
#define   SDT_MEMRW       18   /* memory read write */
#define   SDT_MEMRWA      19   /* memory read write accessed */
#define   SDT_MEMROD      20   /* memory read only expand dwn limit */
#define   SDT_MEMRODA     21   /* memory read only expand dwn limit accessed */
#define   SDT_MEMRWD      22   /* memory read write expand dwn limit */
#define   SDT_MEMRWDA     23   /* memory r/w expand dwn limit accessed */
#define   SDT_MEME        24   /* memory execute only */
#define   SDT_MEMEA       25   /* memory execute only accessed */
#define   SDT_MEMER       26   /* memory execute read */
#define   SDT_MEMERA      27   /* memory execute read accessed */
#define   SDT_MEMEC       28   /* memory execute only conforming */
#define   SDT_MEMEAC      29   /* memory execute only accessed conforming */
#define   SDT_MEMERC      30   /* memory execute read conforming */
#define   SDT_MEMERAC     31   /* memory execute read accessed conforming */

/* is memory segment descriptor pointer ? */
#define ISMEMSDP(s)   ((s->d_type) >= SDT_MEMRO && (s->d_type) <= SDT_MEMERAC)

/* is 286 gate descriptor pointer ? (call gate through trap gate) */
#define IS286GDP(s)   (((s->d_type) >= SDT_SYS286CGT \
             && (s->d_type) <= SDT_SYS286TGT))
/* is 386 gate descriptor pointer ? (call gate through trap gate) */
#define IS386GDP(s)   (((s->d_type) >= SDT_SYS386CGT \
            && (s->d_type) <= SDT_SYS386TGT))
/* is gate descriptor pointer ? */
#define ISGDP(s)      (IS286GDP(s) || IS386GDP(s))

/* is segment descriptor pointer ? */
#define ISSDP(s)      (ISMEMSDP(s) || !ISGDP(s))

/* is system segment descriptor pointer ? */
#define ISSYSSDP(s)   (!ISMEMSDP(s) && !ISGDP(s))

/* Software definitions are in this convenient format; translated into
 * inconvenient segment descriptors when needed to be used by 386 hardware  */
struct   soft_segment_descriptor   {
   unsigned ssd_base ;      /* segment base address  */
   unsigned ssd_limit ;     /* segment extent */
   unsigned ssd_type:5 ;    /* segment type */
   unsigned ssd_dpl:2 ;     /* segment descriptor priority level */
   unsigned ssd_p:1 ;       /* segment descriptor present */
   unsigned ssd_xx:4 ;      /* unused */
   unsigned ssd_xx1:2 ;     /* unused */
   unsigned ssd_def32:1 ;   /* default 32 vs 16 bit size */
   unsigned ssd_gran:1 ;    /* limit granularity (byte/page units)*/
};

extern ssdtosd() ;          /* to decode a ssd */
extern sdtossd() ;          /* to encode a sd */

/* region descriptors, used to load gdt/idt tables before segments yet exist */
struct region_descriptor {
   unsigned rd_limit:16 ;   /* segment extent */
   char *rd_base;           /* base address  */
};

/* Segment Protection Exception code bits  */
#define   SEGEX_EXT   0x01  /* recursive or externally induced */
#define   SEGEX_IDT   0x02  /* interrupt descriptor table */
#define   SEGEX_TI    0x04  /* local descriptor table */
            /* other bits are affected descriptor index */
#define SEGEX_IDX(s) (((s)>>3)&0x1fff)

Descriptors come in many different flavors (see Listing Seven): those that refer to memory or system data structures directly, and gates that indirectly refer to other memory segments. We use task gates to generically switch to the next consecutive task state, and call gates to allow us to enter the kernel's global code segment in a system call. Gates get their name from the controlled fashion in which they regulate ring crossings, again from the MULTICS heritage.
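The bitfields of struct segment_descriptor in Listing Seven can also be written out as explicit shifts, which shows where each field lands in the two 32-bit words of a hardware descriptor. The following is a sketch only: struct soft_seg and pack_segment() are hypothetical stand-ins for soft_segment_descriptor and ssdtosd() (whose body the article does not show), and the 5-bit type follows the listing's convention of folding the memory/system bit into the type value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software form, modeled on soft_segment_descriptor. */
struct soft_seg {
    uint32_t base;     /* segment base address */
    uint32_t limit;    /* segment extent, 20 significant bits */
    unsigned type;     /* 5-bit type, e.g. 27 for SDT_MEMERA */
    unsigned dpl;      /* descriptor level, 0..3 */
    unsigned p;        /* present */
    unsigned def32;    /* default 32 vs 16 bit size */
    unsigned gran;     /* limit granularity (byte/page units) */
};

/* Pack into the low (out[0]) and high (out[1]) words of a descriptor. */
void pack_segment(const struct soft_seg *s, uint32_t out[2])
{
    assert(s->limit <= 0xfffffu);              /* limit must fit in 20 bits */

    out[0] = (s->limit & 0xffffu)              /* limit bits 15..0  */
           | ((s->base & 0xffffu) << 16);      /* base bits 15..0   */

    out[1] = ((s->base >> 16) & 0xffu)         /* base bits 23..16  */
           | ((uint32_t)(s->type & 0x1fu) << 8)
           | ((uint32_t)(s->dpl & 3u) << 13)
           | ((uint32_t)(s->p & 1u) << 15)
           | (s->limit & 0xf0000u)             /* limit bits 19..16 */
           | ((uint32_t)(s->def32 & 1u) << 22)
           | ((uint32_t)(s->gran & 1u) << 23)
           | (s->base & 0xff000000u);          /* base bits 31..24  */
}
```

Packing a flat code segment (base 0, limit 0xfffff, type 27, present, 32-bit, page granular) yields the words 0x0000ffff and 0x00cf9b00, which shows how the base and limit end up scattered through the descriptor.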

Actually our coverage of descriptors is not yet complete. We have hidden descriptors as well that serve special functions. Interrupts and exceptions on the 386 index yet another descriptor table, the Interrupt Descriptor Table (IDT). No program code can call these gate descriptors. Instead, external interrupts and internal processor exceptions transfer through these gate descriptors. We also use a kind of "meta descriptor" called a "region descriptor," which is used to describe descriptor tables so that we can load them via appropriate instructions. So much for the cast of players in this descriptor drama.
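A region descriptor is small enough to spell out byte by byte: a 16-bit limit followed by a 32-bit base address, six bytes in all, which is the shape of the .word/.long pairs (idesc, gdesc) in the srt.s excerpts. The sketch below is a user-level model under that assumption; table_limit() and pack_region() are hypothetical names, and the bytes are written explicitly so that nothing depends on how a compiler lays out struct region_descriptor.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_SIZE 8u   /* every 386 descriptor occupies 8 bytes */

/* Limit for a table of n descriptors: the offset of its last byte. */
uint16_t table_limit(unsigned n)
{
    assert(n >= 1 && n <= 8192);   /* 8192 entries reach the 64 KB maximum */
    return (uint16_t)(n * DESC_SIZE - 1);
}

/* Lay out the 6-byte operand: limit first, then base, little-endian. */
void pack_region(uint8_t out[6], uint32_t base, uint16_t limit)
{
    out[0] = (uint8_t)(limit & 0xff);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(base & 0xff);
    out[3] = (uint8_t)((base >> 8) & 0xff);
    out[4] = (uint8_t)((base >> 16) & 0xff);
    out[5] = (uint8_t)(base >> 24);
}
```

The limit is a last-byte offset rather than a size, which is why the listings pass sizeof(gdt)-1 and sizeof(idt)-1.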

Listing 6:
/* [excerpted from i386.c] */
 ...
/* Descriptor Tables  */

   /* Global Descriptor Table */
#define   GNULL_SEL      0   /* Null Descriptor - obligatory */
#define   GCODE_SEL      1   /* Kernel Code Descriptor */
#define   GDATA_SEL      2   /* Kernel Data Descriptor */
#define   GLDT_SEL       3   /* LDT - eventually one per process */
#define   GTGATE_SEL     4   /* Process task switch gate */
#define   GPANIC_SEL     5   /* Task state to consider panic from */
#define   GPROC0_SEL     6   /* Task state process slot zero and up */
union descriptor gdt[GPROC0_SEL+NPROC];

/* interrupt descriptor table */
struct gate_descriptor idt[NEXECPT+NINTR];

/* local descriptor table */
#define   LSYS5CALLS_SEL 0   /* SVID/BCS 386 system call gate */
#define   LSYS5SIGR_SEL  1   /* SVID/BCS 386 sigreturn() */
#define   LBSDCALLS_SEL  2   /* BSD experimental system calls */
#define   LUCODE_SEL     3   /* user process code descriptor */
#define   LUDATA_SEL     4   /* user process data descriptor */
union descriptor ldt[LUDATA_SEL+1];

/* Task State Structures (TSS) for hardware context switch */
struct   i386tss   tss[NPROC], ptss;

/* software prototypes -- in more palatable form */
struct soft_segment_descriptor gdt_segs[GPROC0_SEL+NPROC] = {
              /* Null Descriptor */
{  0x0,       /* segment base address  */
   0x0,       /* length - all address space */
   0,         /* segment type */
   0,         /* segment descriptor priority level */
   0,         /* segment descriptor present */
   0,0,
   0,         /* default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Code Descriptor for kernel */
{  0x0,       /* segment base address  */
   0xfffff,   /* length - all address space */
   SDT_MEMERA,/* segment type */
   0,         /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   1,         /* default 32 vs 16 bit size */
   1          /* limit granularity (byte/page units)*/ },
              /* Data Descriptor for kernel */
{  0x0,       /* segment base address  */
   0xfffff,   /* length - all address space */
   SDT_MEMRWA,/* segment type */
   0,         /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   1,         /* default 32 vs 16 bit size */
   1          /* limit granularity (byte/page units)*/ },
              /* LDT Descriptor */
{  (int) ldt, /* segment base address  */
   sizeof(ldt)-1,/* length - all address space */
   SDT_SYSLDT,/* segment type */
   0,         /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   0,         /* unused - default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Null Descriptor - Placeholder */
{  0x0,       /* segment base address  */
   0x0,       /* length - all address space */
   0,         /* segment type */
   0,         /* segment descriptor priority level */
   0,         /* segment descriptor present */
   0,0,
   0,         /* default 32 vs 16 bit size */
   0           /* limit granularity (byte/page units)*/ },
               /* Panic Tss Descriptor */
{  (int) &ptss,/* segment base address  */
   sizeof(ptss)-1, /* length - one task state structure */
   SDT_SYS386TSS,  /* segment type */
   0,         /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   0,         /* unused - default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Process 0 Tss Descriptor */
{  (int) &tss[0],/* segment base address  */
   sizeof(tss[0])-1, /* length - one task state structure */
   SDT_SYS386TSS,    /* segment type */
   0,         /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   0,         /* unused - default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ } };

struct soft_segment_descriptor ldt_segs[] = {
              /* Null Descriptor - overwritten by call gate */
{  0x0,       /* segment base address  */
   0x0,       /* length - all address space */
   0,         /* segment type */
   0,         /* segment descriptor priority level */
   0,         /* segment descriptor present */
   0,0,
   0,         /* default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Null Descriptor - overwritten by call gate */
{  0x0,       /* segment base address  */
   0x0,       /* length - all address space */
   0,         /* segment type */
   0,         /* segment descriptor priority level */
   0,         /* segment descriptor present */
   0,0,
   0,         /* default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Null Descriptor - overwritten by call gate */
{  0x0,       /* segment base address  */
   0x0,       /* length - all address space */
   0,         /* segment type */
   0,         /* segment descriptor priority level */
   0,         /* segment descriptor present */
   0,0,
   0,         /* default 32 vs 16 bit size */
   0          /* limit granularity (byte/page units)*/ },
              /* Code Descriptor for user */
{  0x0,       /* segment base address  */
   0xfffff,   /* length - all address space */
   SDT_MEMERA,/* segment type */
   SEL_UPL,   /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   1,         /* default 32 vs 16 bit size */
   1          /* limit granularity (byte/page units)*/ },
              /* Data Descriptor for user */
{  0x0,       /* segment base address  */
   0xfffff,   /* length - all address space */
   SDT_MEMRWA,/* segment type */
   SEL_UPL,   /* segment descriptor priority level */
   1,         /* segment descriptor present */
   0,0,
   1,         /* default 32 vs 16 bit size */
   1          /* limit granularity (byte/page units)*/ } };
 ...
extern ssdtosd(), lgdt(), lidt(), lldt(), usercode(), touser();

init386() {
 ...
   /* make gdt memory segments */
   for (x=0; x < sizeof gdt / sizeof gdt[0] ; x++)
         ssdtosd(gdt_segs+x, gdt+x);
   printf("lgdt\n"); getchar();
   lgdt(gdt, sizeof(gdt)-1);

   /* make ldt memory segments */
   for (x=0; x < sizeof ldt / sizeof ldt[0] ; x++)
      ssdtosd(ldt_segs+x, ldt+x);

   /* make a call gate to reenter kernel with */
   setgate(&ldt[LSYS5CALLS_SEL].gd, &IDTVEC(syscall), SDT_SYS386CGT,
      SEL_UPL);
   printf("lldt\n"); getchar();
   lldt(GSEL(GLDT_SEL, SEL_KPL));
 ...
/* [excerpted from srt.s] */
 ...
   /* lgdt(*gdt, ngdt) */
   .globl   _lgdt
gdesc:   .word 0
   .long 0
_lgdt:
   movl   4(%esp),%eax
   movl   %eax,gdesc+2
   movl   8(%esp),%eax
   movw   %ax,gdesc
   lgdt   gdesc
   jmp   1f           /* flush instruction prefetch q */
   nop
1: movw   $0x10,%ax   /* reload other "well known" descriptors */
   movw   %ax,%ds
   movw   %ax,%es
   movw   %ax,%ss
   movl   0(%esp),%eax
   pushl   %eax
   movl   $8,4(%esp)   /* including the ever popular CS */
   lret
 ...
   /* lldt(sel) */
   .globl   _lldt
_lldt:
   lldt   4(%esp)
   ret
 ...

Because the actual descriptor encoding is somewhat obscure (it was meant to be backward-compatible with the 286), we chose to describe each descriptor in a software form and have a subroutine shuffle it into the appropriate hardware form when it is presented to the processor. In Listing Six, the global and local tables are filled out by translating them into hardware form and loading them with the lgdt() and lldt() functions. We do this, even though we are already in protected mode, to provide the newer version of the descriptor tables that we wish to use. The function lgdt() hides some characteristics of 386 segmentation from view: because we are running on active GDT descriptors when we reload the GDT, we need to flush the instruction prefetch queue and reload all kernel segment registers. This ensures proper code execution. We then reload the CS register by turning the normal intrasegment return into an intersegment return.
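The constants $0x10 and $8 that lgdt() loads into the segment registers are selectors built from the GDT indices in Listing Six. The sketch below repeats the selector macros from Listing Seven (with the ring argument parenthesized) and adds a hypothetical decode_selector() helper to show the three fields; it is an illustration, not code from i386.c.

```c
#include <assert.h>

#define SEL_KPL   0                      /* kernel level */
#define SEL_UPL   3                      /* user level */
#define SEL_LDT   4                      /* table indicator: local table */
#define GSEL(s,r) (((s)<<3) | (r))       /* a global selector */
#define LSEL(s,r) (((s)<<3) | SEL_LDT | (r))   /* a local selector */

#define GCODE_SEL   1                    /* indices from Listing Six */
#define GDATA_SEL   2
#define LUCODE_SEL  3
#define LUDATA_SEL  4

static_assert(GSEL(GCODE_SEL, SEL_KPL) == 0x08, "kernel CS as loaded by lgdt()");
static_assert(GSEL(GDATA_SEL, SEL_KPL) == 0x10, "kernel DS/ES/SS as loaded by lgdt()");

/* Hypothetical helper: break a selector into its fields. */
struct selector_fields {
    unsigned index;      /* descriptor index within the table */
    int      is_local;   /* nonzero if the selector refers to the LDT */
    unsigned rpl;        /* requested level, 0..3 */
};

struct selector_fields decode_selector(unsigned sel)
{
    struct selector_fields f;
    f.index    = (sel >> 3) & 0x1fff;
    f.is_local = (sel & SEL_LDT) != 0;
    f.rpl      = sel & 3;
    return f;
}
```

By the same arithmetic, a user process's code and data selectors would be LSEL(LUCODE_SEL, SEL_UPL) and LSEL(LUDATA_SEL, SEL_UPL), that is, 0x1f and 0x27.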

In the case of our IDT, we use a subroutine, setgate() (see Listing Eight), to build gate descriptors that enter the system's global code descriptor at special assembler stub routines. Each stub, which catches one exception or interrupt, is named through a convention hidden by the IDTVEC() macro. With all of these descriptor tables loaded and in place, the 386 now has complete information describing the legitimate references to memory by user programs, the operating system kernel, and hardware-accessed data structures. Exceptions, including incorrect references to memory, will also be caught and directed to appropriate code.
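What setgate() stores through the gate_descriptor bitfields can likewise be written as explicit shifts. In this sketch pack_gate() is a hypothetical name, the handler address is passed as a plain 32-bit number, and the present bit is always set, as in Listing Eight; the type values 14, 15, and 12 are the interrupt gate, trap gate, and call gate codes from Listing Seven.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a gate into the low (out[0]) and high (out[1]) descriptor words. */
void pack_gate(uint32_t out[2], uint32_t handler, uint16_t selector,
               unsigned type, unsigned dpl)
{
    assert(type <= 0x1fu && dpl <= 3u);

    out[0] = (handler & 0xffffu)               /* offset bits 15..0 */
           | ((uint32_t)selector << 16);       /* target code selector */

    out[1] = ((uint32_t)type << 8)             /* gate type */
           | ((uint32_t)dpl << 13)             /* level allowed to use it */
           | (1u << 15)                        /* present */
           | (handler & 0xffff0000u);          /* offset bits 31..16 */
}
```

For example, an interrupt gate (type 14) for a stub at 0x00123456 reached through selector 0x08 packs to 0x00083456 and 0x00128e00, while a call gate (type 12) usable from ring 3 has 0xec00 in the low half of its second word.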

One virtue of this complicated scheme of descriptors and segments is that it is possible to add new microprocessor features by simply adding new descriptor types. The mechanism is now general enough to support a wide variety of data objects in a consistent way.

On top of the standalone system framework (which really requires very little processor-dependent support) we can write and test portions of code for the operating system kernel (which requires quite a lot of processor-dependent support).

In the following sections of this article, we will discuss some extensions to the standalone system which add kernel functionality. Processor support for the kernel reflects support for memory protection of 386 "rings," ring crossings, and address space translations among other needs (see the accompanying box "Brief Notes: 386 Rings") in "Extending the Standalone System".

These extensions are not required for the standalone system to function. They are used to test the kernel code, and they also form the basis for the prototype kernel code. In essence, the standalone system can be viewed as if it were the kernel itself, or possibly even a nano-kernel!





Copyright 1989, 1990, 2006 TeleMuse Partners, William Jolitz and Lynne Jolitz