Wednesday, March 28, 2012

Strange bug: application dies (partially) after restart

We are observing a strange phenomenon on our ASP.NET applications:
If the applications are cold started (after a server reboot or an IIS
restart), they run for days without any problem.
But if they get restarted while IIS is running (because the web.config file
got modified, or because the DLLs got replaced by new ones), they run OK for
a certain period (a few hours if the load is light, only a few minutes if
the load is heavier) and then the server stops responding to some requests.
It seems to continue generating stateless pages but no longer generates any
stateful pages; I still have to investigate in more detail what is really
happening.
We observed this strange behavior on our two ASP.NET applications, and on
different servers. The applications are based on .NET Framework 1.1 and
Visual J#.
This is not critical because we can restart IIS every time we modify the
config file or upgrade the DLLs but I'd like to understand what is going on.
Did anyone else experience a similar problem? Is there a fix?
Also, it would be great if there were a way to prevent ASP.NET from
restarting the application every time the config file or the DLLs are
modified. Some programs (anti-virus, backup) change the last access time on
these files and they cause an application restart (accessing the file is all
it takes, the application restarts even if the files are not modified),
which is very annoying (especially if the application stops responding
afterwards, as described above).
This automatic restart feature is nice during development but I don't find
it so nice on a production server because I don't like the idea of having a
Web App restart because some background maintenance task is touching its
files. Does anybody know how to prevent this?
Bruno.
The restart is key to making the application object more robust. It's not
something that is going to get removed. You can clamor for an option switch
to turn it off, though.
What do your event logs say during these funny errors?
Regards,
Alvin Bruney [ASP.NET MVP]
Got tidbits? Get it here...
http://tinyurl.com/3he3b
"Bruno Jouhier [MVP]" <bjouhier@.club-internet.fr> wrote in message
news:O4ARZfWFEHA.2560@.TK2MSFTNGP12.phx.gbl...
> We are observing a strange phenomenon on our ASP.NET applications: [...]
Hi Alvin,
Thanks for the reply. Our logs don't say anything. Looks like the stateful
requests are not even delivered to the application any more because
otherwise we would either be getting a response, or an exception (and we log
them all). But I need to attach a debugger to verify this and see what's
really going on inside. So, I need to do more investigation on my side, but
I was wondering if somebody had experienced something similar.
Bruno.
"Alvin Bruney [MVP]" <vapor at steaming post office> wrote in message
news:Ow$vzWaFEHA.576@.TK2MSFTNGP11.phx.gbl...
> The restart is key to having the application object be more robust. [...]
Hi Alvin,
From the symptoms you described, the problem occurs only after the ASP.NET
application is recycled at runtime, and it affects only the stateful pages?
By "stateful page", do you mean a posted-back page? In other words, are
postback requests no longer processed after the recycling?
I'm wondering whether this problem also occurs if you create a simple web
application on the same server. If it does not, we can isolate the problem to
that particular application, and further troubleshooting will be needed.
Also, I recommend that you add some trace code to the affected pages, such as
writing debug information to a custom log file in the different events of the
page. That will help confirm when the problem actually happens.
Anyway, if you have any new findings or updates, please feel free to post
here. Thanks.
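The trace code suggested above could look something like the following minimal sketch. It is written in current Java rather than J# and has not been tried under J#; the class name, log path and event names are placeholders, not anything from the application discussed in this thread.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical trace helper; class name, log path and event names are placeholders.
final class PageTrace {
    // One lock so that lines written by concurrent requests do not interleave.
    private static final Object LOCK = new Object();

    private PageTrace() {}

    // Appends one line: timestamp, thread name, page and event.
    // Returns false instead of throwing, so tracing cannot break a request.
    static boolean log(String logPath, String page, String event) {
        String line = System.currentTimeMillis() + " ["
                + Thread.currentThread().getName() + "] " + page + " " + event;
        synchronized (LOCK) {
            try (PrintWriter out = new PrintWriter(new FileWriter(logPath, true))) {
                out.println(line);
                return !out.checkError();
            } catch (IOException e) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Example: trace the start and end of a (hypothetical) page handler.
        PageTrace.log("page-trace.log", "OrderPage", "begin");
        PageTrace.log("page-trace.log", "OrderPage", "end");
    }
}
```

A "begin" line without a matching "end" line for the same thread would point at the request that got stuck, and a missing "begin" line would suggest the request never reached the page code at all.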
Regards,
Steven Cheng
Microsoft Online Support
Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)
Get Preview at ASP.NET whidbey
http://msdn.microsoft.com/asp.net/whidbey/default.aspx
Hi Steve,
The pages that seem affected are all the pages that have a "session state".
Our server is generating both stateless (no session) and stateful (session)
pages. And it looks like the stateless pages continue to be generated ok but
the stateful ones don't.
Unfortunately, this is a pretty big application and I cannot send you a
repro sample at this point. Also, this is an unusual ASP.NET application
because we interface directly with System.Web and we use a proprietary
rendering engine (we don't use the ASP.NET resources/controls -- we have
special requirements, for example the ability to modify the theme/layout of
the pages at run-time). So, the bug may show up in our case but not in
standard ASP.NET.
What I find really strange is that there is a difference between a cold
start and a hot restart (when the config file or the DLLs are touched). In
the first case, the app works ok, and in the second case it has serious
problems. I was expecting the hot restart to start the application in the
same state as a cold start (basically with a clean AppDomain) but this does
not seem to be completely true.
Anyway, I need to do some more research (reproducing the production server
behavior on dev machines, adding logs, investigating with debugger) to
understand what is really going on. I'll let you know what I find. My post
was more to see if someone else had experienced a similar behavior or had a
quick answer, to avoid some time consuming investigations.
Bruno.
"Steven Cheng[MSFT]" <v-schang@.online.microsoft.com> wrote in message
news:kMlMDUgFEHA.1160@.cpmsftngxa06.phx.gbl...
> Hi Alvin,
> From the symptom you described, the problem only occurred after the ASP.NET
> application recycled at runtime [...]
I did more investigation, and the problem is due to a deadlock (I'm posting
the issue separately on the VJ# NG because it seems to be specific to J#).
The fact that it only seemed to happen after a hot restart was only a
coincidence. Since my initial post, we also saw the problem after a cold
start.
Bruno
"Bruno Jouhier [MVP]" <bjouhier@.club-internet.fr> wrote in message
news:eFeXAEdFEHA.700@.TK2MSFTNGP09.phx.gbl...
> Hi Alvin,
> Thanks for the reply. Our logs don't say anything. [...]
Hi Bruno,
Thanks for your followup. I'm glad that you've figured out the problem.
Since the problem is specific to J#, have you found any way to work around
it? I hope you'll soon resolve this problem thoroughly. If there is anything
I can help with, please feel free to post here. Thanks.
Regards,
Steven Cheng
Hi Steven,
The problem is in the J# Date, SimpleDateFormat and TimeZone classes.
Synchronization seems to be seriously broken and I have a number of
tracebacks and a small repro that demonstrate this (I'm going to post more
stuff on the J# newsgroup).
First, I have tried to fix it by synchronizing around a global object before
calling these JDK methods. But this only works if I have one application
hosted in aspnet_wp.exe. With 2 applications, I get deadlocks between
threads from the 2 apps (very weird, I have posted more details on this one
on microsoft.public.dotnet.vjsharp).
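The global-lock workaround described above can be sketched roughly as follows. This is plain current Java, not the J# code from the application, and the class and method names are hypothetical. One possible reason the approach stops working with two applications, not confirmed anywhere in this thread, is that each ASP.NET application runs in its own AppDomain and gets its own copy of static fields, so a lock held in a static field would not be shared between the two applications.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: every date-formatting call goes through one global lock.
final class GuardedDates {
    // Single lock object shared by all callers in this class loader (assumption).
    private static final Object DATE_LOCK = new Object();

    private GuardedDates() {}

    // Formats the given instant in UTC while holding the global lock.
    static String formatUtc(Date date, String pattern) {
        synchronized (DATE_LOCK) {
            // A fresh formatter per call avoids sharing mutable formatter state.
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format.format(date);
        }
    }

    public static void main(String[] args) {
        // Prints "1970-01-01 00:00:00" (the epoch, formatted in UTC).
        System.out.println(formatUtc(new Date(0L), "yyyy-MM-dd HH:mm:ss"));
    }
}
```

The lock only helps if every call into the date, formatting and time zone classes goes through it; a single unguarded call elsewhere defeats it.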
So, I am in the process of recoding all our date and time stuff with the
.NET APIs. This should fix it, and it should also be faster (fortunately, we
had our own Date and Timestamp classes and everything is encapsulated there,
so the recoding is very local).
BTW: The .NET stuff is very good overall but it is rather poor when it comes
to time zones. .NET only knows about the machine's time zone and UTC time.
This is not enough for a server app where you want every user to see and
input dates and times formatted in his own time zone. I posted a suggestion
on the Whidbey NG to get extensions to this: APIs to enumerate time zones,
find them by name, and also a "thread" variable that would keep track of the
current time zone and influence the way DateTime values are interpreted in the
current thread (something like TimeZone.CurrentTimeZone that would work very
much like CultureInfo.CurrentCulture). I'm repeating it here because I think
that this is a real hole in the .NET API, and an API like that would make my
life easier.
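To make the suggestion concrete, here is a small sketch of what a per-thread "current time zone" could look like. It is written in Java with the JDK's TimeZone class purely as an illustration; the CurrentTimeZone class and its methods are hypothetical and are not an existing .NET or JDK API.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical API sketch; the names are illustrative only.
final class CurrentTimeZone {
    // Each thread starts with UTC until a request handler sets the user's zone.
    private static final ThreadLocal<TimeZone> CURRENT =
            ThreadLocal.withInitial(() -> TimeZone.getTimeZone("UTC"));

    private CurrentTimeZone() {}

    // Looks a zone up by name. Note: the JDK falls back to GMT for unknown IDs.
    static void set(String zoneId) {
        CURRENT.set(TimeZone.getTimeZone(zoneId));
    }

    static TimeZone get() {
        return CURRENT.get();
    }

    // Resets the thread to the default, e.g. at the end of a request.
    static void clear() {
        CURRENT.remove();
    }

    // Formats an instant in the calling thread's current time zone.
    static String format(Date instant) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        format.setTimeZone(get());
        return format.format(instant);
    }
}
```

A request handler would call CurrentTimeZone.set(...) with the user's zone at the start of the request and CurrentTimeZone.clear() at the end, so that pooled threads do not leak one user's zone into the next request. The zone names available for enumeration in the JDK come from TimeZone.getAvailableIDs().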
Bruno.
"Steven Cheng[MSFT]" <v-schang@.online.microsoft.com> wrote in message
news:Y7LgaErGEHA.3244@.cpmsftngxa06.phx.gbl...
> Hi Bruno,
> Thanks for your followup. I'm glad that you've figured out the problem. [...]
Hi Steven,
I have posted the repro on the microsoft.public.dotnet.vjsharp NG because it
is really specific to J#.
I will also post some tracebacks of other deadlocks that I got on the J# NG
but I'll do it later because I don't have time to format and explain them
right now.
The deadlock that I have posted is really very very simple: one thread calls
TimeZone.getDefault() and the other one calls Calendar.getInstance(), and
they manage to deadlock each other (not always but at least one run out of
4).
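The shape of such a repro might look like the sketch below. This is not the code posted on the J# newsgroup; it is a hypothetical reconstruction in current Java, where the two calls are not expected to deadlock. The timeout is there so that a stuck run is reported instead of hanging the program, and each attempt would normally be a fresh process so that class initialization happens again.

```java
import java.util.Calendar;
import java.util.TimeZone;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical repro shape: two threads released at the same moment,
// one calling TimeZone.getDefault(), the other Calendar.getInstance().
final class TimeZoneCalendarRace {
    private TimeZoneCalendarRace() {}

    // Returns true if both threads finished within the timeout,
    // false if they appear to be stuck (a possible deadlock).
    static boolean runOnce(long timeoutMillis) {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(2);

        Thread zoneThread = worker(start, done, () -> TimeZone.getDefault());
        Thread calendarThread = worker(start, done, () -> Calendar.getInstance());
        zoneThread.start();
        calendarThread.start();
        start.countDown(); // release both threads together

        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    private static Thread worker(CountDownLatch start, CountDownLatch done, Runnable call) {
        Thread thread = new Thread(() -> {
            try {
                start.await();
                call.run();
                done.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        thread.setDaemon(true); // a stuck thread must not keep the process alive
        return thread;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(5000) ? "completed" : "stuck (possible deadlock)");
    }
}
```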
Bruno.
"Steven Cheng[MSFT]" <v-schang@.online.microsoft.com> wrote in message
news:MU4eTS7GEHA.2224@.cpmsftngxa06.phx.gbl...
> Hi Bruno,
> Thanks for your followup. Since you mentioned that you've generated a small
> repro, I think it would be helpful if you could package the repro with
> everything it needs and send it to us. Definite repro steps are also
> necessary, so that we can do some in-depth research on our side.
> In addition, I have also run into the TimeZone problems of the .NET Framework
> classes before, since they don't provide enough interfaces to process
> TimeZone with the client CultureInfo. Sometimes we even need to build our own
> custom lookup table to map a CultureInfo to a certain TimeZone.
> Anyway, we'll also forward this suggestion to the related product team.
> Hope it will be improved soon in a following release. Thanks.
> Regards,
> Steven Cheng
Hi Bruno,
Thanks for your reply. From your further description, I understand that the
problem is specific to J#, and I hope you'll resolve it soon.
Also, if you think there is anything else we can help with, please feel free
to post here or mail me via the mail address in my signature (remove the
"online"). Thanks.
Regards,
Steven Cheng
