Tuesday, March 31, 2009

Tracking Down RTE: 35601 Element Not Found

We recently turned on some new features of a product we were using and would get this RTE on a few records within the application. The number of problem records was small, but the total number of records worked with in a day made them a regular occurrence.

I went over to our test environment. It was up to current patch levels, but the data was a few months old. Almost all of my broken records in live were also broken in test. That gave me hope that it's an existing data issue and not something actively breaking records.

The nice thing about our test environment is that I am usually the only person using it. That makes it easy to toss the SQL Profiler on without having to filter it to just my traffic. I started the profiler at the point just before clicking the button that generated the RTE and paused it just after the message came up. I closed the app and repeated this process a few times for both working and broken records.

Looking at the queries that each ran, the broken records all stopped at the same point after the same query was run. That tells me that the application is processing the queries as it runs them and the last one before it stopped is my suspect query. If both the broken and the working records had the same full list of queries I would have had to check them all. That would have indicated to me that it pre-loads most of the data before processing it. I was in luck.
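The comparison of traces can be sketched in a few lines of Python. This is only an illustration of the idea, not what I actually ran: the function names, the table names in the sample queries, and the way record IDs are masked are all assumptions made for the example.

```python
import re


def normalize(query):
    """Mask numeric literals so traces for different record IDs compare equal."""
    return re.sub(r"\b\d+\b", "?", query)


def find_suspect(working_trace, broken_trace):
    """Return (suspect, next_expected).

    suspect is the last query the broken trace ran. next_expected is the
    query the working trace ran right after it, or None if the broken
    trace did not stop early.
    """
    working = [normalize(q) for q in working_trace]
    broken = [normalize(q) for q in broken_trace]
    if not broken:
        return None, (working[0] if working else None)
    # The broken trace should match the working one up to where it stops.
    for i, query in enumerate(broken):
        if i >= len(working) or working[i] != query:
            raise ValueError(f"traces diverge at position {i}")
    next_expected = working[len(broken)] if len(broken) < len(working) else None
    return broken[-1], next_expected


# Hypothetical traces captured for one working and one broken record.
working = [
    "SELECT * FROM orders WHERE id = 101",
    "SELECT * FROM items WHERE order_id = 101",
    "SELECT * FROM notes WHERE order_id = 101",
]
broken = [
    "SELECT * FROM orders WHERE id = 202",
    "SELECT * FROM items WHERE order_id = 202",
]
print(find_suspect(working, broken))
```

If the second value comes back as None, both traces ran the full list of queries and every query would need checking.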

The query was a large one but the only difference between all records was the record ID. So it is the same query running for both working and non-working records. Running the query for all of them gave me different sized record sets. Some had a lot of records and some had very few. One thing did jump out at me as I looked at the raw data.

There were a few null values in a column that looked like it should have a value. From my experience with the data, I knew it should have had one. I saw another column that had data in it directly related to the reference I expected instead of that null value. Because this looked like a complicated query, my first guess was a bad join.

That column was not the only one with nulls in it, but every time it had a null the 3 columns to the right also had nulls. That's what I expect a bad join to look like. So I started reformatting the query with proper indents and white space. It did contain a few unions. The first query in the union returned practically the same results. It looked the same except for a few fewer rows. My null values were still mixed into the good data.

I focused on the table columns that were null and the one column that contained directly related information. The joins between them looked correct. Exactly how I would have written them. After rewriting the query as a simple 2 table join that used the same logic, I ended up with the same results. It was a join that I expected to be valid every time, but in this case there are times where it's not.
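Here is a rough sketch of that kind of check, using Python and an in-memory SQLite database as a stand-in for the real SQL Server data. The table and column names (item, ref, ref_hint) are made up for the example, and I am assuming an outer join, since that is what leaves nulls in the joined columns when the match is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ref (ref_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, ref_id INTEGER, ref_hint TEXT);
    INSERT INTO ref VALUES (1, 'alpha'), (2, 'beta');
    -- item 12 points at a ref row that does not exist
    INSERT INTO item VALUES (10, 1, 'alpha'), (11, 2, 'beta'), (12, 3, 'gamma');
""")


def orphaned_items(conn):
    """Return item rows whose join to ref finds no match."""
    return conn.execute("""
        SELECT i.item_id, i.ref_id, i.ref_hint
        FROM item AS i
        LEFT JOIN ref AS r ON r.ref_id = i.ref_id
        WHERE r.ref_id IS NULL
        ORDER BY i.item_id
    """).fetchall()


print(orphaned_items(conn))  # [(12, 3, 'gamma')]
```

Filtering on the null side of the join lists only the rows where the relationship is broken, which is a quick way to see how many records are affected.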

It turned out to be a flaw in the application logic: it allowed some changes to one set of tables that should have forced changes in another. That rule was being enforced at the application level instead. Other parts of the system enforced it correctly, but this one point in the application did not. It overlooked the fact that it needed to enforce it, and I almost expect the developer made the same assumption I made.

I know this is a little different from my usual problem/solution posts but this was mostly an internal product and I wanted to document the process I took to solve it.

Friday, March 27, 2009

Quiet/Loud Game

I posted this in another forum and got a lot of good feedback on it. Here is a more formal write up on a game I play with my kids at home. The Quiet Loud Game.

I have 2 young kids at home. One is almost 4 and the other is 17 months. The 4 year old would often make lots of noise like kids do and our attempts to quiet her down were not working very well. Our problem was she didn't truly understand the concept of quiet. She knew she was doing something wrong, but had no idea what. That did not work very well for either of us. I came up with a new game to play.

I turn the radio on to some music and turn the volume up a bit. We run around the house being loud. I yell the word loud or noisy several times and we all jump, and stomp, and scream, and yell, and slam doors, and bang on the floor. Then we start a loud count down from 3, 2, 1 and then I whisper the word quiet as I turn the volume on the radio way down.

Then we all whisper, and tip toe, and sneak, and shush, and softly open and close doors. Only making quiet sounds. Then we start a soft count down from 3, 2, 1 and then I yell the word loud as I turn the radio back up.

We repeat this process several times until someone starts to wear out (usually me) and we end it while we are all quiet. I tell them they did a great job and we all give each other high fives.

Not only does this help teach them the concept of loud and quiet, it also gives them an outlet for that noise. And we all had a lot of fun doing it. I made a special point to do all the stuff we consider loud when doing the loud part.

Sunday, March 22, 2009

Left/Right Airplane Game

My little girl loves airplanes. We live close enough to an airport that she can spot them in the sky all the time. One of her favorite games is to fly like an airplane on my shoulder. I hold her up with her legs out behind and her arms out to the side. She tells me where to go as we fly around the house.

Recently I changed it so she has to say Left or Right. As soon as she says a direction, we go that way. Even if she is pointing the other way. After doing this a few times, she was pointing the same direction she was saying. It worked out well and we had fun doing it.

Monday, March 16, 2009

W32.Downadup.C is in the wild. It looks like Conficker is starting to evolve.

It recently received a new set of instructions that are designed more to protect it than to make it spread any more. It is continuing to attack antivirus software that is used to clean it up. Just as the security industry has gotten into its system of communicating with itself, the virus has gotten a new algorithm that makes that 200 times harder.

This cat and mouse game is about to get interesting. We have yet to see a payload from this virus. There is no question that the attack has changed at this point. Without using any new attack to build a larger infection base, it has to hold what it has already. You don’t fight that hard to keep a system infected unless you have a plan for it later.

https://forums2.symantec.com/t5/Malicious-Code/W32-Downadup-C-Digs-in-Deeper/ba-p/393245%3bjsessionid=1A353B585C96A581ECB9D3536C31ADCB

VB Dates and DBNull.Value SQL

One thing I find that I have to look up often is how to work with database dates in VB.NET code. I can never remember the best way to do it and end up looking at older code every time.

Dates are not usually the problem as much as empty dates. Moving an empty date from VB to SQL and back again. I don't expect my solution to be the best solution. Personally I feel it's too complicated, so there has to be a better solution that I just have not found yet. With that said, here is how I handle null dates in VB.NET.

I don't like working with null values when crossing between applications and code. Mostly because a blank value can serve the same purpose with less code. Dates are the one place I can't use a blank value and it must be null. I will show the use of both strings and dates so you can easily see how they differ.

Here is how I grab values from a DataRow object called row:

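' ToString() turns a DBNull into an empty string
Dim name As String = row("Name").ToString()
' Field(Of T) returns Nothing for a DBNull when T is Nullable
Dim DOB As Nullable(Of DateTime) = row.Field(Of Nullable(Of DateTime))("DOB")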

My use of ToString() is a quick way to convert DBNull strings into an empty string. I could store my date as a string and do the same thing, but there are times I want to keep it as a DateTime object. I could use a plain DateTime and just check the value as I read it for DBNull and adjust to that. Or I can use a Nullable DateTime object to do the same thing for me.

When I am saving the values, I do check for values when I add them as Parameters. Here is what that code looks like:


Dim command As New SqlClient.SqlCommand()

command.Parameters.AddWithValue("@Name", Name)
' An empty date has to be sent as DBNull.Value
If DOB Is Nothing Then
    command.Parameters.AddWithValue("@DOB", DBNull.Value)
Else
    command.Parameters.AddWithValue("@DOB", DOB)
End If

One other thing that you need to watch is using DBNull values in a WHERE clause. By default NULL = NULL is not true (the comparison evaluates to unknown), so no rows match. You have to check for IS NULL instead. If you want more details on why that is, look up ANSI_NULLS. I mention this because it's very common to want to use a parameter in a WHERE clause. If you don't craft it correctly, you will not get the results you expect when the value is NULL.

In your mind you expect this to work:

SELECT * FROM people WHERE DOB = @DOB

You want to use a DBNull to find all the people without a DOB, but it will return 0 records. You could mess with the ANSI_NULLS option or change the query to this:

SELECT * FROM people WHERE (DOB = @DOB OR (@DOB IS NULL AND DOB IS NULL))

This will do the normal check on the param, but will also check to see if both are null. I know I am doing a lot of extra work someplace, but this does work for me. Turning off the ANSI_NULLS option will make the first query work, but I have not looked into it enough to know if it has any negative effects.
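If you want to see the difference between the two queries without a SQL Server handy, here is a small sketch in Python using an in-memory SQLite database. SQLite is only a stand-in here: it treats NULL = NULL the same way as the default SQL Server behavior, and Python's None plays the role of DBNull.Value. The function names and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, dob TEXT);
    INSERT INTO people VALUES ('Ann', '1980-01-02'), ('Bob', NULL);
""")


def find_by_dob_naive(conn, dob):
    """Plain equality: never matches rows when dob is NULL."""
    return conn.execute(
        "SELECT name FROM people WHERE dob = ?", (dob,)
    ).fetchall()


def find_by_dob_null_safe(conn, dob):
    """Equality plus an explicit check for both sides being NULL."""
    return conn.execute(
        "SELECT name FROM people "
        "WHERE (dob = :dob OR (:dob IS NULL AND dob IS NULL))",
        {"dob": dob},
    ).fetchall()


print(find_by_dob_naive(conn, None))              # []
print(find_by_dob_null_safe(conn, None))          # [('Bob',)]
print(find_by_dob_null_safe(conn, "1980-01-02"))  # [('Ann',)]
```

The second query still behaves like a normal equality check when a real date is passed in, so the same statement can be used for both cases.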

Wednesday, March 11, 2009

New transaction cannot enlist in the specified transaction coordinator

I was trying to set up a special feature in the software I work with that involved linked servers. This was a test setup, so I configured some SQL servers I already had set up and ran into this error.

[OLE/DB provider returned message: New transaction cannot enlist in the specified transaction coordinator. ]
OLE DB error trace [OLE/DB Provider 'SQLOLEDB' ITransactionJoin::JoinTransaction returned 0x8004d00a].
Msg 7391, Level 16, State 1, Line 2
The operation could not be performed because the OLE DB provider 'SQLOLEDB' was unable to begin a distributed transaction.


The only message the application gave me was the first line, but I uncovered a SQL command that let me test it without the overhead of running my application. Here is that command:


BEGIN DISTRIBUTED TRANSACTION
select * from linkname.databasename.dbo.tablename
COMMIT TRAN


I was able to get the inner command to run on its own. As a distributed transaction, it would fail every time.

I eventually found this KB from Microsoft to get me in the right direction. http://support.microsoft.com/kb/839279

You may receive a 7391 error message in SQL Server 2000 when you run a distributed transaction against a linked server after you install Windows Server 2003 or Windows XP Service Pack 2

It turned out to be a settings issue with the Microsoft Distributed Transaction Coordinator (MS DTC). I had no idea what that really was. It turns out MS DTC should be enabled when SQL is installed, but by default it's locked down (or is with 2003/XP SP2).

I used dcomcnfg to get to Component Services. Console Root -> Component Services -> Computers -> My Computer -> Properties. On the Default Properties tab, enable Distributed COM on this computer. Under MSDTC -> Security Configuration, check the option for Network DTC Access.

I did make those changes on both servers. I don't know if that was needed or not. I did the linked server last and it worked as soon as I applied the changes. I know those were the exact changes I made on the 2nd server. The first one probably has everything checked or opened up from trying to figure it out.

Another thing you may try if you are having issues is to make sure the Distributed Transaction Coordinator is running as network service and not local system. Reinstalling it as a service and rebooting were a few other tips I read while troubleshooting this problem.