DBMS Interview Questions & Answers

1. What are data and information, and how are they related in a database?

Data is recorded facts and figures, and information is knowledge derived from data. A database stores data in such a way that information can be created.

2. What is Enterprise Resource Planning (ERP), and what kind of a database is used in an ERP application?

Enterprise Resource Planning (ERP) is an information system used in manufacturing companies and includes sales, inventory, production planning, purchasing and other business functions. An ERP system typically uses a multiuser database.

3. What is a DBMS?

DBMS stands for Database Management System. A DBMS receives requests from applications and translates those requests into actions on a specific database. A DBMS processes SQL statements or uses other functionality to create process and administer databases.

4. Why is a database considered to be “self-describing”?

In addition to the users’ data, a database contains a description of its own structure. This descriptive data is called “metadata.”

5. Who is E.F. Codd, and why is he significant in the development of modern database systems?

While working at IBM, E.F. Codd created the relational database model. A paper he published in 1970 presented his ideas to the world at large. His work is the foundation for most of the DBMSs currently in use, and thus forms the basis for database systems as we know and use them today.

6. What is SQL, and why is it important?

SQL stands for Structured Query Language, and is the most important data processing language in use today. It is not a complete programming language like Java or C#, but a data sublanguage used for creating and processing database data and metadata. All DBMS products today use SQL.

Write an SQL SELECT statement to display all the columns of the STUDENT table but only those rows where the Grade column is greater than or equal to 90.

SELECT * FROM STUDENT WHERE Grade >= 90;

7. What is the difference between SQL Server 2000 clustered and nonclustered indexes?

With a clustered index, the data are stored in the bottom level of the index and in the same order as that index. With a nonclustered index, the bottom level of an index does not contain data; it contains pointers to the data. For data retrieval, clustered indexes are faster than nonclustered indexes.

8. Name and briefly describe the five SQL built-in functions.

COUNT: computes the number of rows in a table. SUM: totals numeric columns. AVG: computes the average value. MAX: obtains the maximum value of a column in a table. MIN: obtains the minimum value of a column in a table.

9. Write an SQL SELECT statement to count the number of rows in STUDENT table and display the result with the label NumStudents.

SELECT COUNT(*) AS NumStudents FROM STUDENT;

10. What is an SQL subquery?

An SQL subquery is a means of querying two or more tables at the same time. The subquery itself is an SQL SELECT statement contained within the WHERE clause of another SQL SELECT statement, and separated by being enclosed in parenthesis. Some subqueries have equivalent SQL join structures, but correlated subqueries cannot be duplicated by a join..

11. Discuss the alternative terminology that is used in the relational model.

Relations are also called tables, and sometimes by the older data processing term files. A row is known as a tuple in the relational model, but may also be referred to as a record. Finally, relational model attributes are known as table columns and sometimes as fields.

12. Why are functional dependencies not equations?

Equations deal with numerical relationships. A functional dependency deals with the existence of a determinant relationship between attributes, regardless of whether or not there is a numerical relationship between them. Thus, if we know that there is no hot water every Wednesday, No-Hot-Water is functionally dependent on Wednesday. So, if we know it is Wednesday, then we know we will have No-Hot-Water. This is a functional dependency, but not an equation.

13. What is a foreign key, and what is it used for?

A foreign key is used to establish relationships among relations in the relational model. Technically, a foreign key is a column (or columns) appearing in one relation that is (are) the primary key of another table. Although there may be exceptions, the values in the foreign key columns usually must correspond to values existing in the set of primary key values. This correspondence requirement is created in a database using a referential integrity constraint on the foreign key.

14. What are insertion and deletion anomalies?

A deletion anomaly occurs when, by deleting the facts about one entity, we inadvertently delete facts about another entity; with one deletion, we lose facts about two entities. For example, if we delete the tuple for Student 001289 from a table, we may lose not only the fact that Student 001289 is in Pierce Hall, but also the fact that he has $200 left in his security deposit. An insertion anomaly happens when we encounter the restriction that we cannot insert a fact about one entity until we have an additional fact about another entity. For example, we want to store the fact that the security deposit for Pierce Hall is $300, but we cannot enter this data into the Student relation until a student registers for Pierce Hall.

15. What does it mean when we say that a relation is in Boyce-Codd Normal Form (BCNF)?

A relation is in BCNF when every determinant in the relation is a candidate key. This means that any possible primary key can determine all other attributes in the relation. Attributes may not be determined by non- candidate key attributes or part of a composite candidate key. Thus it is said ‘I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Cod d”

16. You have been given a set of tables with data and asked to create a new database to store them. When you examine the data values in the tables, what are you looking for?

(1) Multivalued dependencies, (2) Functional dependencies, (3) Candidate keys, (4) Primary keys and (5) Foreign keys.

17. Why do normalized tables require more complex SQL when SQL statements are used in application programs’

Tables that are normalized contain data that has been distributed among the tables, but which may need to be recombined to answer queries from an application. To recombine the data, the programmer will have to use subqueries and/or joins. These SQL structures are more complex to write than a simple SELECT statement.

18. What is the multivalue, multicolumn problem?

Include an example not used in the text. The multivalue, multicolumn problem occurs when a table is designed to include multiple columns that hold variations of one type of attribute data. One example is where boat owners have the names of their boats stored as BOAT_Ol, BOAT_02 and BOAT 03.

19. Why is the multivalue, multicolumn problem another form of the multivalued dependency problem?

Both problems try to store multiple values on an attribute in a table. In the multivalue, multiplecolumn problem, the multiple values are stored in different columns. In the multivalued dependency problem, the multiple values are stored in different rows. In both cases, the solution is the same: store the multiple values in a separate table.

20. What is the inconsistent values problem? Include an example not used in the text.

The inconsistent values problem occurs when different users or data sources use slightly different forms of the same data value. One example is where automobiles are specified as “Ford, 2-door, Red” in one cell and “Red Ford 2-door’ in another.

21. Explain the relationship between entity, entity class, and entity instance.

An entity is something that can be identified in the users’ work environment, something that the users want to track. Entities of a given type are grouped into entity classes. An entity instance is the representation of a particular entity.

22. Explain the difference between attributes and identifiers.

Entities have attributes. Attributes are properties that describe the entity’s characteristics. Entity instances have identifiers. Identifiers are attributes that name, or identify, entity instances.

23. Name and describe three types of binary relationships.

1:1 - a single entity instance of one type is related to a single-entity instance of another type.

1:N - a single entity instance of one type is related to many-entity instances of another type. M:N - many-entity instances of one type relate to many-entity instances of another type.

24. What is the archtetype/instance pattern?

The archetype/instance pattern occurs when one entity tracks occurrences of another entity. A common example is the relationship between CLASS and SECTION, where the actual occurrence of a class happens when students register for a SECTION of that CLASS. The archetype/instance pattern is implemented as an ID-dependent relationship.

25. What is a recursive relationship? Give an example not used in the text.

A recursive relationship is a relationship between an entity and itself. For example, given the entity PERSON, a recursive relationship could be used to show a PERSON and his or her SIBLINGs (brothers and sisters).

26. What are the steps for transforming an entity into a table?

The steps are: (1) specify the primary key, (2) specify candidate keys, (3) specify column properties including null status, data type, default value (if any), and data constraints (if any), and (4) verifying normalization.

27. Define a surrogate key, describe the ideal primary key and explain how surrogate keys meet this ideal.

The ideal primary key is short, numeric and fixed. A surrogate key is a unique, DBMS-supplied identifier intended to be used as the primary key of a table. Further, the DBMS will not allow the value of a surrogate key to be changed. The values of a surrogate key have no meaning to the users and are usually hidden on forms and reports. By design, they are short, numeric and fixed and thus meet the definition of the ideal primary key

28. Define and discuss data constraints.

Data constraints on a column are the limits put on the values the data can have. There are four types of data constraints: (1) domain constraints, which define a limited set of values for the column, (2) range constraints, which specify that the values must fall within a certain range, (3) intrarelation constraints, which define what values the column can have based on values of other columns in the same table, and (4) interrelation constraints, which define values the column can have based on values of columns in other tables.

29. In general, how are recursive relationships handled in a database design?

A recursive relationship is a relationship among entities of the same class, and is represented in the same way as other relationships are. The rows of the tables can take two different roles, however. Some are parent rows, and others are child rows. Further, the table will contain both its own primary key and the foreign key that links back to the table itself. If a row has no parent, then the value of the foreign key column in that row will be null. If the row has a parent, then there must be a foreign key value in that row that corresponds to the primary key value of another row in the table.

30. What is a cascading update?

Referential integrity constraints require that foreign key values in one table correspond to primary key values in another. If the value of the primary key is changed -- that is, updated -- the value of the foreign key must immediately be changed to match it. Cascading updates will set this change to be done automatically by the DBMS whenever necessary.

31. What is a SQL view? Briefly explain the use of views.

A SQL view is a virtual table built from other tables or views. Views are used to (1) hide columns or rows, (2) show the results of computed columns, (3) hide complicated SQL syntax, (4) layer built-in functions, (5) provide a level of indirection between application programs and tables, (6) assign different sets of processing permissions to tables, and (7) to assign different sets of triggers to the same table.

32. Explain the “paradigm mismatch” between SQL and application programming languages.

SQL statements return a set of rows, while an application program works on one row at a time. To resolve this mismatch the results of SQL statements are processed as pseudofiles, using a cursor or pointer to specify which row is being processed.

33. Name four applications for triggers.

(1) providing default values, (2) enforcing data constraints, (3) updating views and (4) enforcing referential integrity

34. What are stored procedures, and how do they differ from triggers?

A stored procedure is a program that is stored within the database and is compiled when used. They can receive input parameters and they can return results. Unlike triggers, their scope is database-wide; they can be used by any process that has permission to use the database stored procedure.

35. What are the advantages of using stored procedures?

The advantages of stored procedures are (1) greater security, (2) decreased network traffic, (3) the fact that SQL can be optimized and (4) code sharing which leads to less work, standardized processing, and specialization among developers.

36. What is the relationship of ODBC, OLE DB, and ADO?

Developed first, the ODBC standard is for relational databases; while the OLE DB standard provides functionality for both relational and other databases. Finally, ADO was developed to provide easier access to OLE DB data for the non-object-oriented programmer.

37. What are the three types of data sources used with ODBC?

An ODBC file data source is a file that can be shared among database users. A ODBC system data source is one that is local to a single computer. A ODBC user data source is only available to the user who created It.

38. What disadvantage of ODBC does OLE DB overcome?

By breaking the features and the functions of a DBMS into COM objects, OLE DB characteristic overcomes a major disadvantage of ODBC. With ODBC, a vendor must create an ODBC driver for almost all DBMS features and functions in order to participate in ODBC at all. This is a large task that requires a substantial initial investment. With OLE DB, however, a DBMS vendor can implement portions of their product.

39. What are to goals of OLE DB?

The major goals of OLE DB are to: (1) Create object interfaces for DBMS functionality pieces; (2) Increase flexibility for developers and users; (3) provide an object interface over any type of data; and (4) do not force data to be converted or moved from where it is.

40. In OLE DB, what is the difference between an interface and an implementation?

An OLE DB interface is specified by a set of objects, and the properties and methods that they expose, and OLE DB defines standardized interfaces. An object need not expose all of its properties and methods in a given interface. An OLE DB implementation defines how the object supports the interface. The implementation is completely hidden from the user. Thus developers of an object are free to change the implementation whenever they want, but they should not change the interface without consulting their users.

41. What are the two means to describe the content of XML documents?

DTD (Document Type Declarations) and XML Schemas. An XML document that conforms to its DTD is called type-valid. A document can be well-formed and not be type-valid, either because it violates the structure of its DTD or because it has no DTD. However, DTD5 have limitations and to overcome these limits XML Schemas were created. XML Schemas are XML documents that are the preferred method for defining document structure.

42. What is the difference between simple elements and complexType elements?

Simple elements have only one data value. ComplexType elements can have multiple elements nested within them. ComplexTypes may also have attributes. The elements contained in a complexType may be simple or other complexTypes. ComplexTypes may also define element sequences.

43. What is a dataset?

A dataset is an in-memory database that is disconnected from any regular database, but has all the important characteristics of a regular database. Datasets can have multiple tables, relationships, referential integrity rules, referential integrity actions, views and the equivalent of triggers. Datasets are needed to provide a standardized, non-proprietary means to process database views. They are especially important for the processing of views with multiple multivalued paths.

44. What is Relationship set?

The collection (or set) of similar relationships.

45. What is Relationship type?

Relationship type defines a set of associations or a relationship set among a given set of entity types.

46. What is DDL (Data Definition Language)?

A data base schema is specifies by a set of definitions expressed by a special language called DDL.

47. What is VDL (View Definition Language)?

It specifies user views and their mappings to the conceptual schema.

48. What is SDL (Storage Definition Language)?

This language is to specify the internal schema. This language may specify the mapping between two schemas.

49. What is Data Storage - Definition Language?

The storage structures and access methods used by database system are specified by a set of definition in a special type of DDL called data storage-definition language.

50. What is DML (Data Manipulation Language)?

This language that enable user to access or manipulate data as organised by appropriate data model.

1. Procedural DML or Low level: DML requires a user to specify what data are needed and how to get those data.

2. Non-Procedural DML or High level: DML requires a user to specify what data are needed without specifying how to get those data.

51. What is DDL Interpreter?

It interprets DDL statements and record them in tables containing metadata.

52. What is Record-at-a-time?

The Low level or Procedural DML can specify and retrieve each record from a set of records. This retrieve of a record is said to be Record-at-a-time.