How to Fix a Corrupt MSDB SQL Server Database

Posted on

Problem

You have to fix a corrupt SQL Server msdb database and to make things worse, there is no recent database backup. If you run setup with the option REBUILDDATABASE you will also rebuild the master database and lose all the logins unless you backup and restore the master database. Is this the best option? Keep reading and you will see that there is another way to rebuild the msdb database without compromising the master database.

Solution

Dealing with corruption of system databases is one of the most feared situations for SQL Server DBAs. Luckily, there are a few different options to fix the corrupt SQL Server instance to get back up and running. You can get a list of those methods in John Grover’s tip on How to recover a suspect msdb database in SQL Server. The last method John proposes in his tip is to “Use a SQL Server Template MSDB Database”, but this option has two paths depending on what SQL Server version you are working with. If you have an instance of SQL Server 2008 or later, you can copy the database template in the Binn\Templates subfolder of the instance root directory. But if you are working with a previous version then you have to use the method that I will expose in this tip.

Most of the articles about msdb database corruption refer to a physically corrupted database, but what if instead you have logical corruption? I mean, for example if someone accidentally or deliberately modified some of the stored procedures of the msdb database. In that case, using the template msdb database won’t be the best approach because you will lose all the job definitions, Integration Services Packages stored in msdb, etc.

The possible solution to the previous scenario relies on the Install subfolder of SQL Server instance root directory, which is usually C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Install. There is a file in that folder named instmsdb.sql which is the msdb creation script. Let’s see how we can use it.

Option 1 – Rebuild a Corrupt msdb SQL Server Database

This procedure is quite straightforward. It consists of starting the SQL Server instance with trace flag 3608. This trace flag prevents the instance from automatically recovering any database except master. But databases will be started and recovered when accessed, so if we want to rebuild the msdb database it is a wise to avoid connections to that database.

The first step is to stop the SQL Server Instance and all the dependent services like Integration Services, Analysis Services, Reporting Services and any other application that use SQL Server. You can use any administrative tool for that purpose like the Services Manager. The screen shot below shows a command window with the commands (i.e. NET STOP) I have used to stop the SQL Server services.

Stopping SQL Server Services.

Now we are going to start the SQL Server service with the following parameters.

net start MSSQLSERVER /T3608 /f /m"SQLCMD"

The /T3608 parameter enables the trace flag 3608 to prevent the msdb database from starting up. Notice that the “T” letter is uppercase; most of the startup flags are specified with an uppercase “T”, there are some others that use a lowercase “t”, but are usually used by Microsoft technical support only. The “/f” parameter is to start SQL Server service with its minimal configuration and in single user mode; we want to be the only ones to access the instance. The last parameter /m”SQLCMD” tells the instance to only accept connections for an application that identifies as SQLCMD. That’s to avoid other applications to connect to the instance.

After the instance starts up, we open a Command window and change the folder to the install folder of the instance root directory and from there we connect to it with the SQLCMD application and detach the msdb database.

SQLCMD -E -S  -Q"EXEC sp_detach_db msdb"

The –E parameter is used to establish a trusted connection, you can use a SQL Server login like sa if you want to. The –Q parameter executes the quoted commands and exits.

Now you can stop the instance and start it up again normally, I mean without any special consideration.

The next step is to rename the msdb database files; otherwise the next step will fail. By default these files are in the DATA sub-folder for the SQL Server instance.

If you try to connect to the instance using SQL Server Management Studio you will get an error message, but you can still use SQL Server Management Studio by selecting New\Database Engine Query, like on the next image.

Connecting With SQL Server Management Studio.

Now to recreate the msdb database you can use the next SQLCMD command.

SQLCMD -E -S -iinstmsdb.sql –ologfile.txt

The “-iinstmsdb.sql” parameter instructs SQLCMD to execute the instmsdb.sql script, of course you have to be in the same folder as the script. The “–ologfile.txt” option will save the output to a file named logfile.txt in the same folder.

Also you can execute this script by dragging the instmsdb.sql file into SQL Server Management Studio. The next image is a screen capture showing the execution.

instmsdb.sql Script Execution on SQL Server Management Studio.

Option 2 – Recover Stored Procedures in the SQL Server msdb Database

If your msdb database has altered stored procedure and you want to get them back to the original version, you can get the original code from the instmsdb.sql file. I tested that if you run the instmsdb.sql script it won’t drop and recreate the system tables in msdb if they exist, but it will drop and recreate the stored procedures. The next script was copied from instmsdb.sql file and intends to create the sysjobs table. Notice that the CREATE TABLE statement is inside an IF block that is executed if the sysjobs table doesn’t exist, so it doesn’t try to recreate the table.

/**************************************************************/
/* SYSJOBS                                                    */
/**************************************************************/

IF (NOT EXISTS (SELECT *
                FROM msdb.dbo.sysobjects
                WHERE (name = N'sysjobs')
                  AND (type = 'U')))
BEGIN
  PRINT ''
  PRINT 'Creating table sysjobs...'

  CREATE TABLE sysjobs
  (
  job_id                     UNIQUEIDENTIFIER NOT NULL,
  originating_server_id      INT              NOT NULL, -- REFERENCE enforced by trig_sysjobs_insert_update
  name                       sysname          NOT NULL,
  enabled                    TINYINT          NOT NULL,
  description                NVARCHAR(512)    NULL,
  start_step_id              INT              NOT NULL,
  category_id                INT              NOT NULL,
  owner_sid                  VARBINARY(85)    NOT NULL,
  notify_level_eventlog      INT              NOT NULL,
  notify_level_email         INT              NOT NULL,
  notify_level_netsend       INT              NOT NULL,
  notify_level_page          INT              NOT NULL,
  notify_email_operator_id   INT              NOT NULL,
  notify_netsend_operator_id INT              NOT NULL,
  notify_page_operator_id    INT              NOT NULL,
  delete_level               INT              NOT NULL,
  date_created               DATETIME         NOT NULL,
  date_modified              DATETIME         NOT NULL,
  version_number             INT              NOT NULL
  )

  CREATE UNIQUE CLUSTERED INDEX clust ON sysjobs(job_id)
  CREATE NONCLUSTERED     INDEX nc1   ON sysjobs(name) -- NOTE: This is deliberately non-unique
  CREATE NONCLUSTERED     INDEX nc3   ON sysjobs(category_id)
  CREATE NONCLUSTERED     INDEX nc4   ON sysjobs(owner_sid)
END
go

In contraposition, the next script also extracted from instmsdb.sql script creates the sp_agent_start_job stored procedure. Notice that before the CREATE PROCEDURE there is an IF block that drops the sp_agent_start_job stored procedure if it exists.

PRINT ''
PRINT 'Creating procedure sp_agent_start_job...'
IF (NOT OBJECT_ID(N'dbo.sp_agent_start_job', 'P') IS NULL)
  DROP PROCEDURE dbo.sp_agent_start_job
GO

CREATE PROCEDURE dbo.sp_agent_start_job 
  @job_id      UNIQUEIDENTIFIER
AS
BEGIN
    DECLARE @retval INT 

    EXEC @retval = sys.sp_sqlagent_start_job @job_id

    RETURN(@retval) -- 0 means success 
END
GO

The best part of this fix is that you don’t have to stop the instance and start it with a trace flag in order to execute the instmsdb.sql script. The next three images are screen captures that I took showing the contents of the sysjobs table before executing instmsdb.sql script, the script execution and the contents of the sysjobs table after executing instmsdb.sql script.

Here we can see that there is one job called SampleJob.

Contents of sysjobs Table Prior to Running instmsdb.sql Script.

In the next screen shot we see the execution of instmsdb.sql.

Execution of instmsdb.sql Script.

In this final screenshot after instmsdb.sql has been run, we can see that job SampleJob still exists.  So all of the data stays intact after running instmsdb.sql against an existing msdb database.

Contents of sysjobs Table After Executing instmsdb.sql Script.

Leave a comment