All Courses

Preparation Guide DBMS Interview Questions: 52 Key Question

Gautam Pardeshi

a year ago

dbms interview questions - blog 

Introduction - 

        Landing your dream job in the field of database management systems (DBMS) can be a challenging and competitive process. With the rise of digitalization and automation, the demand for skilled professionals in this field has skyrocketed, making the job market more competitive than ever. However, with proper preparation and the right set of tools, you can increase your chances of success and stand out from the crowd during a DBMS interview.
       In this article, we have compiled a list of 50+ DBMS interview tips and tricks to help you prepare for your dream job. From understanding the different types of DBMS to anticipating common interview questions, this guide covers it all. Whether you are a seasoned professional or just starting out, these tips will help you showcase your knowledge, skills, and passion for the field, and ultimately land the job of your dreams.

1) What is DBMS?

        A DBMS (Database Management System) is a software application that interacts with end users, other applications, and the database itself to capture and analyze the data. It provides an interface for the users to perform various operations on the data, such as creating, retrieving, updating, and deleting data, as well as managing and enforcing rules for data integrity and security. Examples of DBMS include MySQL, Oracle, and Microsoft SQL Server.

2) What is a database?

        A database is a collection of data that is organized in a specific way, allowing for efficient search and retrieval of the information it contains. Databases can be used to store a wide range of information, including text, numbers, images, and more. They are often used in business, government, and other organizations to store and manage large amounts of data. Common types of databases include relational databases, document databases, and key-value databases.

3) What is a database system?

        A database system is a collection of data that is organized in a specific way, allowing for efficient storage, retrieval, and manipulation of the information. It typically includes a software program for managing the data, as well as the data itself. There are many different types of database systems, such as relational databases, document databases, and key-value stores, each with its strengths and use cases.

4) What are the advantages of DBMS?

There are several advantages of using a Database Management System (DBMS) including:
  • Data consistency and integrity
  • Data security
  • Concurrent access and data sharing
  • Backup and recovery of data
  • Data modeling and schema creation
  • High scalability and performance
  • Reduced data redundancy and improved data organization
  • Support for complex queries and reporting
  • Data independence and abstraction
  • Easy data access and manipulation.

5) What is a checkpoint in DBMS?

           A checkpoint in a database management system (DBMS) is a process that periodically saves the current state of the database to disk, allowing for recovery in the event of a failure or system crash. This process typically includes writing the current transaction log to disk, as well as updating the database files to reflect the current state of the data. Checkpoints can be triggered by a specific event, such as a certain amount of time passing or a certain number of transactions being processed, or they can be triggered manually. The frequency at which checkpoints occur can be configured by a database administrator.

6) When does the checkpoint occur in DBMS?

          In a database management system (DBMS), a checkpoint is a process that periodically writes the current state of the database to disk. This includes the contents of memory buffers and the transaction log, as well as any other information needed to recover the database in the event of a system failure. The exact timing of checkpoints depends on the specific DBMS and can be configured by the database administrator. Typically, checkpoints occur at regular intervals, such as every hour or every day, or when a certain amount of data has been written to the database.

7) What do you mean by transparent DBMS?

           A transparent database management system (DBMS) is a type of DBMS that allows users to access and manipulate data in a database without needing to know the underlying data structures or the specific details of how the data is stored. The DBMS is said to be "transparent" because it hides the complexity of the database from the user and presents the data in a way that is easy to understand and use. This allows users to focus on the data itself, rather than the technical details of how it is stored and accessed.

8) What are the unary operations in Relational Algebra?

In Relational Algebra, the unary operations are:
  • Selection (σ): Selects a subset of rows from a relation that satisfies a given condition.
  • Projection (π): Selects a subset of columns from a relation.
  • Renaming (ρ): Renames the columns of a relation.
  • Set difference (A - B): Returns the rows that are in relation A but not in relation B.
  • Union (A ∪ B): Returns the rows that are in relation A or in relation B.
  • Intersection (A ∩ B): Returns the rows that are in both relation A and relation B.
  • Cartesian product (A × B): Returns the set of all possible combinations of rows from relation A and relation B.
  • Division (A / B): Returns all the tuples from A that can be paired with tuples from B such that the combination of the two tuples is in A.

9) What is RDBMS?

          A relational database management system (RDBMS) is a type of software that is used to manage and organize data stored in a relational database. RDBMSs typically use SQL (Structured Query Language) to create, retrieve, update, and delete data stored in tables, which are organized into rows and columns. Examples of RDBMSs include MySQL, Oracle, and Microsoft SQL Server.

10) How many types of database languages are?

There are several types of database languages, including:
  • Data Definition Language (DDL) - used to define the database schema and create, alter, and delete database objects. Examples include CREATE, ALTER, and DROP.
  • Data Manipulation Language (DML) - used to manipulate the data in the database. Examples include SELECT, INSERT, UPDATE, and DELETE.
  • Data Control Language (DCL) - used to control access to the data in the database. Examples include GRANT and REVOKE.
  • Transaction Control Language (TCL) - used to manage the transactions in the database. Examples include COMMIT and ROLLBACK.
  • Data Query Language (DQL) - used to query the data in the database. Examples include SELECT.
  • Data Mining Language (DML) - used to extract patterns and knowledge from the data stored in the database.
  • Data Warehouse Language (DWL) - used to extract and retrieve data from a data warehouse.
  • Data Stream Language (DSL) - used to extract and retrieve data from a data stream.
It is important to note that not all database management systems support all of these languages and some may have their specific languages.

11) What do you understand by Data Model?

          A data model is a way of representing data in a structured format, typically in the form of a diagram or schema. It defines the relationships between different pieces of data, such as entities and their attributes, and the rules that govern those relationships. Data models are used in a variety of fields, including database design, software engineering, and business analysis, to help organize and understand large amounts of data. They can be implemented using a variety of data modeling techniques, such as entity-relationship modeling, object-oriented modeling, and data flow modeling.

12) Define a Relation Schema and a Relation.

           A relation schema is a blueprint for a relation, which defines the name of the relation and the names and types of its attributes. For example, a relation schema for a "Student" relation might include attributes for a student's name, ID, and major, and specify that the name attribute is a string, the ID attribute is an integer, and the major attribute is also a string.
A relation is a table of data that conforms to a relation schema. It is a set of tuples, where each tuple represents a single record and each attribute in the tuple corresponds to an attribute in the relation schema. For example, a relation conforming to the "Student" relation schema might have the following tuples:
(John Smith, 1, Computer Science)
(Jane Doe, 2, English)
(Bob Johnson, 3, Mathematics)
Each tuple represents a single student and contains the student's name, ID, and major, in that order, as specified by the relation schema.

13) What is a degree of Relation?

          A degree of a relation refers to the number of entities or elements in a given set that is related to one another. In mathematical terms, it is the number of elements in a set that are related to a given element in a binary relation. For example, in a set of people, the degree of relation could be the number of family members each person has. In a graph, the degree of a vertex is the number of edges incident to it.

14) What is the Relationship?

          In a database management system (DBMS), a relationship refers to the connection between two or more tables in the database. These relationships can be defined using keys, such as primary and foreign keys, which link the data in one table to the data in another table. Several types of relationships can exist in a DBMS, including one-to-one, one-to-many, and many-to-many relationships. These relationships help to ensure data integrity and consistency in the database and allow for efficient querying and data retrieval.

15) What are the disadvantages of file processing systems?

Disadvantages of file processing systems in DBMS include:
  • Data redundancy and inconsistency: data may be duplicated in multiple files, leading to inconsistencies and errors.
  • Limited data sharing: data is typically stored in separate files, making it difficult for multiple users to access and share the data simultaneously.
  • Data isolation: changes made to one file may not be reflected in other files, leading to data isolation issues.
  • Data integrity: maintaining the consistency and accuracy of data can be difficult in file processing systems.
  • Concurrent access: multiple users trying to access the same file at the same time can lead to conflicts and data corruption.
  • Security: file-based systems may not have advanced security features, making it easier for unauthorized users to access or modify data.
  • Backup and recovery: it can be difficult to back up and recover data in a file-based system.
  • Scalability: file-based systems may not be able to handle large amounts of data or a high number of users.

16) What is data abstraction in DBMS?

          Data abstraction in a DBMS refers to the process of hiding the internal details of the data storage and management from the users and presenting them with a simplified view of the data. This allows users to interact with the data in a more intuitive and user-friendly way, without needing to know the underlying details of how the data is stored and manipulated. The goal of data abstraction is to provide a level of insulation between the physical data storage and the logical representation of the data, making it easier to change the underlying data storage without affecting the way users interact with the data.

17) What are the three levels of data abstraction?

The three levels of data abstraction are:
  • Physical level: This is the lowest level of abstraction and deals with the physical storage of data on storage devices such as disk drives or memory.
  • Logical level: This level of abstraction deals with the logical representation of data and how it is organized, such as in tables or files.
  • View level: This is the highest level of abstraction and deals with how users interact with and see the data, such as through a specific application or interface.

18) What is DDL (Data Definition Language)?

          Data Definition Language (DDL) is a subset of SQL that is used to define the structure of a database. It includes commands such as CREATE, ALTER, and DROP, which are used to create, modify, and delete database objects such as tables, indexes, and views. DDL statements are typically executed by a database administrator and are used to create and maintain the structure of a database.

19) What is DML (Data Manipulation Language)?

         Data Manipulation Language (DML) is a subset of SQL (Structured Query Language) that is used to manage data within a database. DML includes commands such as SELECT, INSERT, UPDATE, and DELETE, which are used to retrieve, insert, modify, and delete data in a database, respectively. DML is used to manipulate the data in the database, but it does not include commands for creating or modifying the database schema.

20) Explain the functionality of the DML Compiler.

          A DML (Data Manipulation Language) compiler is a software program that converts DML statements (such as SELECT, INSERT, UPDATE, and DELETE) written in a high-level programming language into executable code that can be run on a database management system (DBMS). The DML compiler reads the DML statements, checks for syntax and semantic errors, and then generates an optimized version of the code that can be executed by the DBMS. This process is known as DML compilation. DML compilation is an important step in the process of executing DML statements, as it ensures that the statements are syntactically and semantically correct and that they will run efficiently on the DBMS.

21) What is Relational Algebra?

         Relational algebra is a formal system for manipulating relational databases, based on mathematical concepts such as set theory and logic. It is used to specify queries and operations on relational databases, such as selecting, projecting, and joining tables. Relational algebra provides a set of basic operations, such as selection, projection, and join, which can be combined in various ways to express more complex queries. These operations are used to define relational calculus, which is a more powerful and expressive query language for relational databases.

22) What is Relational Calculus?

         Relational Calculus is a type of formal logic used to specify the structure of a relational database, and the operations that can be performed on it. It is used to define the relationships between different tables in a database, and to express queries that can be used to retrieve data from the database. There are two types of relational calculus: tuple relational calculus (TRC) and domain relational calculus (DRC). TRC uses variables to range over the tuples of a relation, while DRC uses variables to range over the domains of the attributes in a relation.

23) What do you understand by query optimization?

        Query optimization in DBMS refers to the process of improving the performance of a database query by identifying and implementing the most efficient method for executing the query. This can include selecting the most appropriate indexes to use, reordering the query conditions to take advantage of indexing, and selecting the most efficient algorithm for joining tables. The goal of query optimization is to minimize the resources required to execute a query and to return the results as quickly as possible. This can be achieved by using techniques such as query analysis, cost estimation, and rule-based optimization.

24) What do you mean by the durability in DBMS?

         Durability is a DBMS that refers to the ability of the system to persist data, even in the event of a system failure or power loss. This is typically achieved through the use of techniques such as transaction logging, database replication, and backup and recovery procedures. The goal of durability is to ensure that data is not lost or corrupted in the event of a failure and that it can be recovered in a consistent state.

25) What is normalization?

         Normalization is the process of organizing data in a database so that it meets certain constraints. This typically involves creating tables and establishing relationships between them, to reduce data redundancy and improve data integrity. The most common forms of normalization are the first normal form (1NF), second normal form (2NF), and third normal form (3NF).

26) What is Denormalization?

         Denormalization is the process of adding redundant data to a database to improve performance. This is often done in situations where the normalized schema (one that follows the rules of normalization in database design) is causing slow query performance due to many joins or other complex relationships between tables. By denormalizing the data and adding redundancies, the database can be optimized for read performance, but at the cost of increased write complexity and the need to manually maintain consistency of the data.

27) What is functional Dependency?

        Functional dependency is a relationship between two attributes in a database, where the value of one attribute determines the value of another attribute. It is a fundamental concept in database design and normalization and is used to ensure data integrity and reduce data redundancy. A functional dependency is typically represented as "X -> Y," where X is the determinant attribute and Y is the dependent attribute.

28) What is the E-R model?

         The E-R model, short for the Entity-Relationship model, is a method used in software engineering to represent and organize data in a database. It is used to model the relationships between entities, which can be defined as real-world objects or concepts, and the attributes that describe them. The E-R model is represented using a diagram that shows entities as rectangles and their relationships as lines connecting them. The E-R model is used to design and implement databases in a logical and structured way, making it easier to understand and maintain the data.

29) What is an entity?

        An entity is a person, place, thing, or concept that is represented in a database or information system. It is an abstraction of real-world objects and concepts and is used to model and organize data. In the context of databases, an entity is often represented as a table or collection of data, with each row representing a specific instance of the entity. In artificial intelligence, an entity can refer to a real-world object or concept that is recognized and processed by the AI system.

30) What is an Entity type?

          An entity type is a category of things or objects that have certain characteristics or attributes in common. In natural language processing (NLP), an entity type is often used to refer to the class or category of a named entity, such as a person, location, or organization, that is identified in the text. For example, in the sentence "Barack Obama was the President of the United States," "Barack Obama" would be identified as an entity of type "person" and "United States" would be identified as an entity of type "location."

31) What is an Entity set?

         An entity set is a collection of similar entities. In the context of databases, an entity set is a table that contains a set of records (or rows) that have the same attributes (or columns). Each record represents an instance of the entity, and the attributes contain the data for each instance. For example, a "customers" entity set would have records for each customer, with attributes such as name, address, and phone number.

32) What is an Extension of entity type?

          An extension of an entity type is a subtype of an existing entity type in a data model. It allows for additional attributes or behaviors to be added to the parent entity type, while still maintaining the relationships and constraints of the parent type. It is a way to create a more specific version of a more general entity type, while still maintaining the relationship between the two.

33) What is a Weak Entity set?

           In database management systems (DBMS), a weak entity set is a set of entities that do not have a primary key attribute of their own but rely on the primary key of another entity, known as the owner or dominant entity. A weak entity set is represented by a double rectangle in an entity-relationship diagram, with a dashed line connecting it to the dominant entity. The primary key of a weak entity set is a combination of the primary key of the dominant entity and its own attributes.

34) What is an attribute?

         In a database management system (DBMS), an attribute is a characteristic or property of an entity that is being stored in the database. For example, in a database of customers, the attributes of a customer might include their name, address, and telephone number. Attributes are often defined as part of the database schema, which outlines the structure and organization of the data stored in the database. They are usually also considered columns in a table.

35) What are the integrity rules in DBMS?

          In a database management system (DBMS), integrity rules are used to ensure that the data stored in the database is accurate, consistent, and reliable. These rules can be divided into two main categories: entity integrity and referential integrity.
  • Entity Integrity: This rule ensures that the primary key in a table is unique and not null.
  • Referential Integrity: This rule ensures that there are no orphan records in a table that refers to a non-existent record in another table.
Other integrity rules are:
  • 3. Domain integrity: This rule ensures that the data entered into a column is of the correct data type and adheres to any constraints, such as minimum and maximum values.
  • User-defined integrity: This rule allows user-defined constraints to be applied to the data, such as check constraints, triggers, and stored procedures.
  • These integrity rules help to maintain the consistency and accuracy of the data in the database and to prevent errors and inconsistencies from occurring.

36) What do you mean by extension and intension?

         In database management systems (DBMSs), the terms "extension" and "intension" are used to refer to the properties of a database schema or table.
  • The extension of a table refers to the set of all data that currently exists in the table or the set of all rows in the table. The extension of a table can change over time as new rows are added or existing rows are deleted.
  • The intension of a table, on the other hand, refers to the definition or structure of the table, including the names and data types of the columns and any constraints or rules that apply to the data in the table. The intension of a table typically remains constant over time.
So in short, extension refers to the actual data stored in the table and intension refers to the structure of the table.

37) What is System R? How many of its two major subsystems?

       System R was a research project at IBM San Jose Research Laboratory in the 1970s that aimed to create a working prototype of a relational database management system (DBMS). It was one of the first major research projects to focus on the relational model for databases, which was proposed by Edgar Codd in 1970.
There are two major subsystems in System R: the Relational Data System (RDS) and the System R Access Method (SRAM). The RDS is responsible for managing the data stored in the database, while the SRAM is responsible for managing access to the data by the different applications that use the database.

38) What is Data Independence?

        Data independence refers to the ability to change the schema of a database without affecting the application programs that use the data. There are two types of data independence: logical data independence and physical data independence. Logical data independence refers to the ability to change the logical structure of the data without affecting the application programs that use it. Physical data independence refers to the ability to change the physical storage of the data without affecting the application programs that use it.

39) What are the three levels of data abstraction?

The three levels of data abstraction are:
  • Physical level: This is the lowest level of abstraction and deals with the physical storage of data, such as the location of data on a disk or the layout of data in memory.
  • Logical level: This level of abstraction deals with the logical organization of data, such as the structure of tables in a relational database. It describes what the data represents, but not how it is stored.
  • View level: This is the highest level of abstraction and deals with the presentation of data to the user. It describes how the data is presented, such as in the form of a report or a graph.
All these levels of data abstraction are used to hide the complexity of data storage and organization, making it easier for users to interact with the data.

40) What is Join?

         In a database management system (DBMS), a join operation combines rows from two or more tables based on a related column between them. The result is a new table containing all the columns from the joined tables, with rows for each combination of matching rows from the input tables. There are several types of joins, including inner join, outer join, and cross join. The specific type of join used will depend on the desired outcome and the structure of the tables being joined.

41) What is 1NF?

        1NF stands for "First Normal Form" in database management. It is a basic level of normalization in which each cell in a table contains only atomic (indivisible) values, and there are no repeating groups of data. In other words, each column in a table represents a single attribute of the entity represented by the table, and each row represents a single instance of that entity. This helps to minimize data redundancy and improve data integrity

42) What is 2NF?

         2NF stands for Second Normal Form. In database design, it is a level of normalization in which a table is in 2NF if it is in 1NF (First Normal Form) and all non-key columns are functionally dependent on the primary key. This means that each non-key column in the table is dependent on the entire primary key, and not just a part of it. This helps to eliminate data redundancy and improve data integrity

43) What is 3NF?

         3NF stands for Third Normal Form. It is a database normalization rule that states that a table should not contain any non-primary key columns that are not dependent on the primary key. In other words, all non-primary key columns in a table should be functionally dependent on the primary key. This helps to minimize data redundancy and improve data integrity.

44) What is BCNF?

           BCNF stands for Boyce-Codd Normal Form. It is a normal form in database theory that is used to test for a certain type of redundancy in the data stored in a relational database. A table is in BCNF if and only if every non-trivial functional dependency X → Y in the table is such that X is a superkey for the table. In other words, a table is in BCNF if and only if every determinant is a candidate key.

45) Explain ACID properties

         ACID properties refer to a set of properties that guarantee that database transactions are processed reliably. These properties are:
  • Atomicity: Ensures that a transaction is treated as a single, indivisible unit of work. Either all the changes made in a transaction are committed to the database, or none of them are.
  • Consistency: Ensures that a transaction brings the database from one valid state to another. The rules that define what constitutes a valid state are called integrity constraints.
  • Isolation: Ensures that concurrent execution of transactions results in a system state that would be obtained if transactions were executed sequentially.
  • Durability: Ensures that once a transaction has been committed, its effects are permanent and survive any subsequent failures.
In summary, ACID properties are a set of properties that ensure that database transactions are reliable, consistent, and isolated and that the effects of transactions are permanent.

46) What is the stored procedure?

          A stored procedure is a precompiled collection of SQL statements that can be executed on a database server. They are typically used to encapsulate a set of operations that are performed frequently or that need to be executed with a specific set of parameters. Stored procedures can accept input parameters and return multiple output values, and they can be called from a variety of programming languages and frameworks. They can improve database performance and security since they are pre-compiled and executed on the server.

47) What is the difference between a DELETE command and a TRUNCATE command?

        In a database management system (DBMS), the DELETE command is used to delete specific rows from a table, while the TRUNCATE command is used to delete all rows from a table.
The DELETE command can be used to delete specific rows based on a WHERE clause, while TRUNCATE is used to delete all rows from a table. Additionally, DELETE generates a rollback log and triggers a delete trigger, while TRUNCATE does not and it is much faster than DELETE.

48) What is 2-Tier architecture?

          2-Tier architecture in databases refers to a system where the database management system (DBMS) runs on a separate server from the application that accesses the database. In this architecture, the application communicates directly with the DBMS through a network connection, sending SQL queries and receiving results.
In a 2-Tier architecture, the application and the DBMS are tightly coupled, meaning that changes to one will likely affect the other. This can make it difficult to scale the system or make changes to the database without affecting the application.
Examples of 2-Tier architectures include client-server systems where the client is a desktop application and the server is a database management system or web applications where the application runs on a web server and the database is on a separate server.
Overall, 2-Tier architecture is a simple approach to database management, but it may not be the best option for systems that need to handle a high volume of requests or need to be easily scalable.

49) What is the 3-Tier architecture?

          The 3-tier architecture is a software design pattern that separates an application into three logical layers: the presentation layer, the application logic layer, and the data storage layer. The presentation layer, also known as the user interface, is responsible for displaying data to the user and accepting input. The application logic layer, also known as business logic, is responsible for processing data and making decisions based on that data. The data storage layer, also known as the database, is responsible for storing and retrieving data. This separation of concerns allows for greater flexibility, scalability, and maintainability in the development and deployment of an application.

50) How do you communicate with an RDBMS?

          An RDBMS (Relational Database Management System) can be communicated using SQL (Structured Query Language). SQL is a programming language that is used to manage and manipulate relational databases, including creating, reading, updating, and deleting data in the database. This communication can happen through a variety of methods such as command line interface, query tools, or by using a programming language such as Python, Java, C#, etc to communicate with the database using an appropriate driver or library.

51) What is the difference between a shared lock and an exclusive lock?

         A shared lock allows multiple threads to read a resource simultaneously, while an exclusive lock allows only one thread to access a resource (for reading or writing). A shared lock can be upgraded to an exclusive lock, but an exclusive lock cannot be downgraded to a shared lock.

52) Describe the types of keys?

In DBMS (Database Management Systems), several types of keys are used to organize and access data:
  • Primary Key: A primary key is a unique identifier for each record in a table. It is used to uniquely identify a record and cannot be null.
  • Foreign Key: A foreign key is a field in a table that is used to establish a link between the data in two tables. It is used to reference the primary key of another table.
  • Candidate Key: A candidate key is a set of one or more fields that can be used as a unique identifier for a record in a table. A table can have multiple candidate keys, but only one primary key.
  • Composite Key: A composite key is a combination of two or more fields that are used together as a primary key for a table.
  • Alternate Key: An alternate key is a candidate key that is not chosen as the primary key for a table. It can still be used as a unique identifier for a record in the table.
  • Surrogate Key: A surrogate key is a unique identifier for a record in a table, usually a number or a code, that is used as the primary key instead of using a natural key, which is a column that has a logical relationship to the table.

Summary:

          The article provides 50+ tips and tricks for individuals preparing for a database management system (DBMS) interview. The tips cover various aspects such as knowledge of database concepts, data warehousing, SQL commands, performance optimization, and more. The article also provides advice on how to handle behavioral questions and technical questions, as well as how to present oneself professionally during the interview process. The ultimate goal of the article is to help individuals land their dream job in the field of DBMS.

Submit Review