Generally
in a computer, Data is stored on the disc in the form of zeros and ones (1, 0).
The disc is either made with non-magnetic material, which is coated with a thin
layer of magnetic material (CD/DVD’s) or solid-state drives (Memory Cards/HDD).
In the magnetic disks data is stored by magnetizing the thin film and spinning
the disk. And in the solid-state drives data is stored eclectically. BIT is a basic storage area on the disc. A
bit can store either zero or one (0 or 1). If that area (or bit) electrically
ON then it is treated as 1 else treated as 0. Each and every character from the
computer keyboard internally converted to ASCII code, and next to machine
understandable (binary) language i.e. 1's and 0's. Operating system is a set of
programs, that can convert High-level language (Human understandable language
e.g. English, C) to Low-level language (Machine understandable language i.e.
binary) and vice versa.
For example,
8 bits
=
1 byte
1024 bytes
= 1 KB (Kilo Bytes)
1024
KB’s = 1 MB (Mega Byte)
1024
MB’s = 1 GB (Giga Byte)
1024
GB’s = 1 TB (Terabytes)
1024
TB’s = 1 PB (Peta byte)
1024
PB’s = 1 EB (Exabyte)
Data
in SQL Server is arranged in the form of Pages and Extents.
A
Page is basic storage area in SQL Server, having 8 KB size on the disc, in that
96 bytes allocated to page header, and which contains information about the
page like page number, page type, the amount of free space on the page, and the
allocation unit ID of the object that owns the page etc.
A
table is a collection of Rows and Columns. When you insert Rows (logical sequence) in to a table those
rows internally spread over to multiple pages (physical
sequence)
on the disk. A row stores all respective columns either by value or by
reference. One or more rows are sequentially packed into a page, once the page
is full continues to next page. Like this as many as possible rows are fit into
a page. When a row needs to be accessed, the respective page copied into buffer
memory. Here the deviation between logical and physical sequence is treated as 'External
Fragmentation'. And the percentage of empty space within a page treated
as 'Internal
Fragmentation' of that page.
Extents
are a collection of eight physically contiguous pages and are used to
efficiently manage the pages. All pages are stored in extents.
INDEX
- When there are thousands of records in a table, retrieving data will take a long time. By using INDEX we can increase the performance of retrieving data from database.
- Indexes speed up the querying process by providing fast access to rows in the data tables, similarly to the way a book’s index helps you find information quickly within that book.
- Therefore indexes are created on columns which are accessed frequently, so that the information can be retrieved quickly.
- Indexes can be created on a single column or a group of columns.
HEAP:
A table that does not have an Index is called a HEAP.
Indexes
are introduced in SQL Server to find or modify requested data in minimum amount
of time using few system resources to achieve max performance. In general SQL
Server has only two possible ways to access the requested data from the
database.
- It can scan each and every row from the beginning to end, if it meet the requested criteria then it fetch the requested data.
- Or if an index available, it uses that index to locate the requested data.
Indexes
took the space on the disk and must be kept in sync with tables. SQL Server indexes
are implemented a type of B-TREE structure. The B-TREE architecture consist
ROOT level, NON-LEAF level and LEAF level.
- Every Index has a single ROOT level on top used as starting point for traversing an index.
- All the index levels above the LEAF level including the ROOTs known as the NON-LEAF levels.
- The LEAF level is the bottom level of the index structure it contains Key-Value index entries that either reference the rows in the data pages or complete data rows.
SQL
Server supports two types of Indexes, Clustered and non-Clustered. Both are
having the similar B-Tree structure.
Clustered index: Here the LEAF level
has the actual data pages from the table sorted in ascending order. The order
of the values in the index pages also in ascending. There can be only one
clustered index per table, and might be on one or more columns. SQL Server
internally maintains uniqueness of key values for each row even if the column
data is not unique. For example in a Clustered Index column data inserted ‘RADHA’
as first instance, here all key values are unique. When subsequent values of
‘RADHA’ are inserted again SQL Server generates an internal number to maintain uniqueness
of key-value pair.
Non clustered index: Non clustered index
has the same B-Tree structure as clustered index with a difference. The LEAF
level of Non-Clustered index contains key values instead of actual data. These
key-values are pointed to Clustered Index keys (if that table has the Clustered
Index) or the rows in the data pages.
The
implementation a Non-Clustered index depends on whether the data pages of a
table as a HEAP, or as a Clustered Index. It means that if the table does not
have a Clustered Index the LEAF level pointed to rows in the data pages. If the
table has a Clustered Index SQL Server builds the Non-Clustered Index on top on
Clustered Index so that Non-Clustered LEAF level pages use the Clustered Index.
(Continue
reading on next post…)