Call us today, toll free:
888-7649350

MSSQL 05 Transaction Log and Data migration.

Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length

 
Advanced search

4338 Posts in 706 Topics- by 1723 Members - Latest Member: FeariaUsakatt

January 07, 2009, 04:18:38 AM
Live! ForumTechnical supportGeneral supportMSSQL 05 Transaction Log and Data migration.
Pages: [1] |   Go Down
Print
Author Topic: MSSQL 05 Transaction Log and Data migration.  (Read 650 times)
Cognecy Solutions, LLC
Full Member
***

Karma: +0/-0
Posts: 11


« on: June 02, 2008, 11:27:11 PM »

I was In the middle a synchronized data migration on mssql03 and had the Transaction Log on filling up on me WAY too prematurely.  This SQL Server (mssql03) did this on a previous customer's data migration as well.  It is also adding in an enormous amount of what I call, "white space", to the DB during the data transfer that artificially makes the DB much larger than it really is. Currently, the only resolution I have is to submit a ticket to have the Shrink DB run from a SA log-on and have the transaction log dumped.  Since I am often under a time table to get a DB migration completed, it is often counter productive to wait for a ticket to be completed since, depending on the amount of tickets ahead of me, I could be waiting a while. As it is, when the transaction log fills up, the system is not even letting me run simple scripts much less run the shrink command or backup (which typically dumps the TL).  During a data migration I really do not need to keep any of the data in the transaction log once it fills up.   

As a large part of my businesses service offering (about 60% for this one application) I do database migrations into the MSSQL platform .  In fact I am the only vendor the developer of this e-commerce application recommends for this type of work which keeps me pretty busy.  This is what actually drove me toward hosting my customers sites (my customers were begging me). I expect a very good chunk of the DB migrations I have done over the last few years to he hosting their sites with me soon.  Anyway, the point is this: I cannot recall in the last several years when I have seen a transaction log in MSSQL fill up this fast during a synchronized data migration.  As a rule, I do not use wizards or DTS to migrate data as both are extremely unreliable and both cause a lot of index fragmentation.  I almost always use our own synchronization protocol that is typically less burdensome on the transaction log and causes next to no 'additional' DB fragmentation or white space (artificial DB size)...except for when I run it on this server.

I specialize in database migration...Not server setup (That is you guys).  I will be the first to admit that when it comes the setting up the server (SQL or Windows), I am a novice at best....I'm just the application guy that found a niche I was very good at in database migration.  At this point I would even be willing to purchase extra dedicated storage for nothing but my customer's SQL transaction logs (but I hope we can come up with an alternative solution for now until I can get more customers signed up.)

I did find one clue in the Default SQL Server Collation that is a bit concerning from a DBA's perspective (may or may not be the culprit)...

I noticed the default server collation on mssql03 is set to Latin1_General_CS_AI on mssql03.  The de-facto "norm" for Shared DB SQL Servers (at least in English speaking countries) is SQL_Latin1_General_CP1_CI_AS.  I have seen a few places that have different collations as their default but by a wide margin, hosts in the USA, Canada, and England are using the SQL collation I mentioned.   Having the default collation on the server set to "Case Sensitive" is a pretty big concern and can cause some headaches in a shared environment especially if you need to run any kind of sorting or grouping scripts (tempdb uses the collation of the model db which is likely not going to match or be compatible with the user's database collation if it is supporting a standardized application such as a commercial e-commerce software).  As far as having "AI" instead of AS, that is not as big of a concern as 'not using an SQL collation' and having CS (case sensitive) set in the default.  I think many DBAs would consider that risky for a shared environment since it is not as universal in the 1252 language set.

Question:
Is EIRCA set on using Latin1_General_CS_AI? (understanding what a pain it is to change the server collation after the server is up and running with DB's on it). 

If so, I would definitely recommend monitoring it closely and periodically polling the other DB's properties to see what the most common collation is that your users are using for their applications.  I did a test recently on my own local SQL server I have running in my office and found that there is a marked difference between the Latin1_General_xx_xx and the SQL_Latin1_General_CP1_xx_xx collations.  THey are definitely not equivalent.  I was running a simple script that cleans up duplicate row entries and SQL server could not resolve the diffences beteween the SQL collation and the non-SQL collation that were both set to CI_AS.

Just looking for a simple solution

Mark Shipp
Cognecy
Logged

Mark Shipp
Chief Executive Manager
Cognecy Solutions, LLC
Jason
Administrator
Sr. Member
*****

Karma: +0/-0
Posts: 39


WWW
« Reply #1 on: June 03, 2008, 12:20:58 AM »

Hi Mark,

Latin1_General_CS_AI is what Microsoft recommend for a shared hosting environment; changing a specific database collation takes around 10 seconds, it can be done directly through most sql management programs.

We can't force people to lose (via a backup schedule for example) their transaction logs (there would be a riot). We truncate the logs for unused data on a set schedule per http://support.microsoft.com/kb/873235 but instant gratification for removing SQL logs has always been this way. If your log is too big for the import, I can only suggest raising the DB and/or log size, however temporary that setting may be. It's just a by-product of importing a large database in to SQL Server 2005.

Based on Microsoft's advice, the collation was set. Out of curiosity, I just ran a check on this DB server to see what the most commonly used collation was and it does appear to be Latin1_General_CS_AI that is most commonly used. If the trend changes then we'll reevaluate the situation like with most other software instances.

Like always, if you'd like to discuss your options with this instance, please create a ticket so one of our server administrators can discuss it with you. They'll be happy to, for sure.

Thanks!
Logged

Jason S.
EIRCA Signature Support
EIRCA Shared, Reseller and Managed dedicated hosting: www.EIRCA.net
Give us your feedback: http://feedback.eirca.net/survey.php?surveyid=2
Cognecy Solutions, LLC
Full Member
***

Karma: +0/-0
Posts: 11


« Reply #2 on: July 02, 2008, 02:51:02 AM »

Thanks Jason for your reply and I apologize that it has taken me a while to get back on the forum.

I hope that is is okay to give a bit more background on the subject in this forum. I see collations problems several times a week with customer databases(not necessarily on EIRCA servers) and it is a piece that is often overlooked and mis-understood when migrating to a newer platform (SQL2000-SQL2005) or even migrating from one hosting environment to another (SQL2005-to-SQL2005)

With regard to the earlier comment that the most common collation is Latin1_General_CS_AI, my speculation is that is what is looks like on the EIRCA servers primarily because that is what the EIRCA server's master collation is set to so therefore, all new databases are being created will automatically inherit this collation.

The typical user of SQL databases in a hosted environment will not have DBA experience and will not know this to be a restrictive collation (or even bother to check the collations) in terms of backward compatibility to migrated SQL2000 databases.  If a new database is set up with that collation there will likely be no harm as long as they build their application around it or make accommodations.  However, if an old database (sql2000) is migrated into a non-backward compatible environment, the potential for disaster and downtime is great.

What I was pointing out in my first post was that in a hosting environment there will be a very large mix of different databases from different applications and in fact, the most common "Industry" collation is SQL_Latin1_General_CP1_CI_AS. The "SQL" delimited collation really began to be exploited with the release of SQL2005 because the rules for collation changed drastically with that release.  Now old SQL2000 based applications have to be concerned with migration and backwards compatibility when moving to SQL2005. The non-"SQL" delimited collations are specifically used for a clean installation of SQL2005 when there is no concern or desire for backwards compatibility. The whole purpose of the "SQL" delimited collations is for that backwards compatibility.  With the huge amount of SQL2000 databases still in use and needing to be migrated on to the newer platform (more so now with the release of SQL2008), you can see that it is important to make this transition as smooth as possible for end users in a shared environment.

The recommendation from Microsoft (with regard to SQL2005 collation)is hard coded right into the default instance of SQL2005 when it is installed new.  By default, SQL2005 is configured to run in “SQL_Latin1_General_CI_AS”, or [SQL Server; Latin1 General; Case Insensitive; Accent Sensitive]. I have not personally found there it be a great deal of difference on whether or not specifying "Code Page 1" (CP1) in the collation  makes a difference but the norm in SQL shares hosting environments is to specify it.  I think this is primarily for the sake of precision and again, compatibility with old databases being migrated on to the newer platform.  This default mode and all other modes beginning with “SQL”, are SQL 2000 compatibility modes.  All of the compatibility modes, in my opinion, will be important to use as long as there are still a large amount of SQL2000 deployments out there.) If you need to configure SQL server to talk with 2000 database servers and imported/migrated databases, the default mode is absolutely the one you want. If you want a clean SQL 2005 installation with no remnants of SQL2000, then the 2005 equivalent “Latin1_General_CI_AS” will serve you better (best used for individual instances of SQL - not shared).  Please note that in either case, it is CI_AS and not CS_AI.  This is not an accident: the forethought being that it(I_AS) is the most generic collation available for the 1252 character set accommodating English, Spanish, and French (Latin character based languages) with varying degrees of capitalization and accent rules in the written languages respectively. 

Choosing the default collation mode for SQL 2005 is critically important, as it affects tempdb (the temporary tables), master, and all newly created databases. The only scenario where I can see a "Case Sensitive" object as necessary would be if the SQL server is set up in a non-shared environment and the databases there-in handle a large amount of  things like case-sensitive passwords. Even then, there are a multitude of other methods to achieve case-sensitivity for those scenarios without resorting to placing this type of restriction on the master database.


I guess if I could make a request in this forum, it would be that this issue be opened for review with the EIRCA development/deployment team to consider changing the default Server collations.

As for a recommendation to new and/or existing customers thinking of migrating to SQL2005 and reading my endless diatribe, it would be to pay close attention to your old collations and be sure to match the collation of the new SQL2005 DB to the old collation prior to migration.  If you have any encryption in your database, keep in mind that encryption was created using a specific set of character rules and if you migrate that data into a new DB with a different collation, you will likely have problems revolving that encryption.  The good news in all this is that you do have the ability to adjust those database specific collations on your individual database irregardless of the SQL Server collation.
Logged

Mark Shipp
Chief Executive Manager
Cognecy Solutions, LLC
Pages: [1] |   Go Up
Print
Jump to:  

Recent forum discussions