  1. #1
    Join Date
    Mar 2007
    Posts
    167

    Unanswered: If you were faced with this dilemma, what would you do?

    If you were faced with this dilemma, what would you do?

    We had a Sybase ASE 12.5.4 server crash on us (it produced a stack trace and shut down abruptly). Sybase Japan is telling us that it crashed because we are using network packet sizes larger than the "default network packet size". Our configuration is as follows...

    default network packet size ---- 1024 (bytes)
    max network packet size ------- 8192 (bytes)
    additional network memory ---- 37888000 (bytes)

    We are using network packet sizes of 1024 and 4096. As you can see from the above configuration, both packet sizes are within the allowable range (between "default network packet size" and "max network packet size").
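
    As a rough illustration of the range check described above, here is a minimal Python sketch. The function names are made up for this post, and the multiple-of-512 rule is my reading of the Sybase manuals, so treat both as assumptions rather than anything ASE itself exposes.

```python
# Values copied from the configuration listed above.
DEFAULT_PACKET_SIZE = 1024  # "default network packet size"
MAX_PACKET_SIZE = 8192      # "max network packet size"


def is_allowed_packet_size(requested, default=DEFAULT_PACKET_SIZE,
                           maximum=MAX_PACKET_SIZE):
    """True if the requested size lies between default and max.

    Assumption: packet sizes are expected to be multiples of 512.
    """
    return default <= requested <= maximum and requested % 512 == 0


def uses_larger_than_default_packet(requested, default=DEFAULT_PACKET_SIZE):
    """True if the connection asks for more than the default packet size.

    These are the connections that draw on "additional network memory".
    """
    return requested > default


for size in (1024, 4096):
    print(size, is_allowed_packet_size(size),
          uses_larger_than_default_packet(size))
```

    Both of our packet sizes pass the range check; only the 4096-byte connections are larger than the default.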

    Sybase Japan has provided us with the "Workarounds" below. From our findings (noted above), our environment is configured appropriately and according to the Sybase documentation. The recommendations / workarounds from Sybase Japan completely contradict the Sybase documentation and seem to be outright wrong.

    If you were faced with this dilemma, what would you do?


    Please read the following...

    ------------------------------------------------------------------------------------
    Sybase Japan Support Team wrote...
    ------------------------------------------------------------------------------------

    Workaround
    The followings are the workaround that can be expected to be effective at this time..

    1. Adjust the packet size of default network
    By increasing 'default network packet size' configuration(active) parameter up to the same value of 'max network packet size' configuration (active) parameter, the memory ensure in advance will be increased to make the additional memory unnecessary. In the result, it will not go through the logic caused the concerned phenomena.

    2. Large size of packet should not be used.
    The packet size used for Client application will be adjusted not to use a large packet size, and then which makes it unnecessary to obtain the additional memory. In the result, it will not go through the logic caused the concerned phenomena.

    3. Increase 'additional network memory' active parameter
    By increasing the size of additional memory of server wide, it may affect the timing of memory crash. In the result, it can be expected that the frequency of memory crash may be changed.

    These mesaures described above is just the method led from the result of analysis of Stack Trace, etc.. So, it may not be ensured by this method to prevent the failure completely (100 percent). If we want to find the complete workaround, it is required to reoccur the phenomena.
    ------------------------------------------------------------------------------------


    The below documentation completely contradicts their above recommendations / workarounds...

    ---------------------------
    - Sybase Documentation:
    ---------------------------
    Please use the following URLs (Sybase Manuals) for reference...

    Max Network Packet Size:
    http://infocenter.sybase.com/help/in...sag/X12540.htm

    Default Network Packet Size:
    http://infocenter.sybase.com/help/in...sag/X12319.htm

    Additional Network Memory:
    http://infocenter.sybase.com/help/in...sag/X65869.htm
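
    For reference, the System Administration Guide describes a way to estimate "additional network memory": add up the packet sizes of the connections that will use larger-than-default packets, multiply by three buffers per connection, add 2% overhead, and round up to a multiple of 2048. The sketch below is my own Python rendering of that arithmetic as I remember it from the manual (the function and parameter names are hypothetical), so please check it against the manual before relying on it.

```python
def estimate_additional_network_memory(packet_sizes, buffers_per_user=3,
                                       overhead_percent=2, granule=2048):
    """Estimate "additional network memory" in bytes.

    packet_sizes: one entry per connection that will use a packet size
    larger than "default network packet size".
    The constants (3 buffers, 2% overhead, 2048-byte rounding) are taken
    from my recollection of the manual and should be treated as assumptions.
    """
    total = sum(packet_sizes) * buffers_per_user
    total += total * overhead_percent // 100  # add overhead, truncated
    # Round up to the next multiple of the granule (integer arithmetic).
    return -(-total // granule) * granule


# Four connections: two at 8192 bytes and two at 4096 bytes.
print(estimate_additional_network_memory([8192, 8192, 4096, 4096]))
```

    With four connections using 8192, 8192, 4096 and 4096 byte packets, this arithmetic gives 75776 bytes. Our current setting of 37888000 is itself an exact multiple of 2048.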


    They claim that they have contacted the US Sybase Engineer Team and that the US Sybase Engineer Team agrees with their above recommendations / workarounds. I find this hard to believe. They will not provide any proof that the US Sybase Engineer Team supports their recommendations / workarounds. They will not allow us to talk to the US Sybase Engineer Team directly. They claim that it's outside of our support agreement (i.e. we don't have a global contract). So we are stuck with their opinion and technical advice.

    By the way... if their recommendations / workarounds are correct, this means that there is a fatal bug in Sybase ASE 12.5.4: using different network packet sizes will cause the dataserver to experience catastrophic failure. We feel that if this is the case, it should be publicly announced to all customers that this fatal bug exists.

    What are your thoughts?

  2. #2
    Join Date
    Mar 2007
    Posts
    167

    Potential serious / fatal bug (stack traces and abruptly does a shutdown)

    -----------------------------------------------------------------------------------
    Sybase Japan has recommended the following (a.k.a. workarounds)...
    -----------------------------------------------------------------------------------
    1) Increase the "default network packet size" configuration to the
    same size of "max network packet size" configuration.
    2) Do not use large packet sizes.
    3) Increase "additional network memory" configuration.
    -----------------------------------------------------------------------------------

    I could literally write on and on about all the reasons why this is bad technical advice. To sum it up... this is either bad technical advice, or we have a serious / fatal bug on our hands (with Sybase ASE 12.5.4), or both.

    If you sit down and think through their recommendations / workarounds, you will see that they don't address anything. If anything, they cause more, if not bigger, problems. Take workaround #2, for example: "Do not use large packet sizes". Wow, I can't believe my eyes every time I read this workaround / recommendation. This is like telling a person who bought a 4X4 truck that they can only use 2 wheel drive. This would mean that the truck has serious defects. That is where we stand today with Sybase ASE 12.5.4 and this issue.

    If what they are saying is true, Sybase ASE 12.5.4 has a serious defect / fatal bug. The dataserver produces a stack trace and abruptly shuts down. If this is the case, we need Sybase to officially classify this as a bug / defect and increase the urgency of this open case (the case has been open for nearly 2 months). Their workarounds and this issue pose a large risk to our production environment. I don't believe a banking / trading system can afford this type of risk.

  3. #3
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    For those interested, it was also posted on the sybase.public.ase.administration newsgroup

    It seems there are three options:
    1) Will pre-allocate the memory needed. This seems the best option.
    2) Use the 512 default. This might not be an option. It might slow the app down, depending on what it is doing, so it needs testing.
    3) Might solve the problem by providing the additional memory required.

  4. #4
    Join Date
    Mar 2007
    Posts
    167

    Thank you...

    Thank you for your posting and great feedback. The information you have provided is great and I think it will be very helpful for others. Thank you.

    To highlight a few points... So far, from Sybase's investigation, it seems that the routine used by Sybase ASE 12.5.4 (and potentially other versions) to handle memory for packets larger than the 'default network packet size' setting may be the root cause of the problem. One very important note to keep in mind, though: this has not been proven. It is just a theory.

    So with that in mind...

    - Options 1 and 2 meet the requirement of preventing the routine from being called.

    - Option 3 does not meet the requirement and still poses the risk of the routine being called.

    If the theory is correct, option 1 seems to be the best option at this time. Option 2 could have a large performance impact on applications/clients. Option 3 seems to hold the highest risk.

    Of course, this is based on the theory that a specific routine is the culprit. This still needs to be confirmed by Sybase.


    -----------------------------------------------------------------------------------
    Sybase Japan has recommended the following (a.k.a. workarounds)...
    -----------------------------------------------------------------------------------
    1) Increase the "default network packet size" configuration to the
    same size of "max network packet size" configuration.
    2) Do not use large packet sizes.
    3) Increase "additional network memory" configuration.
    -----------------------------------------------------------------------------------

  5. #5
    Join Date
    Mar 2007
    Posts
    167

    Important Note: Sybase has officially withdrawn the workarounds recommended...

    Important Note: As of 10/31/2007 - Sybase has officially withdrawn the workarounds recommended for this case.

    Sybase is working hard and diligently to find the root cause and resolve the issue. They will keep us all posted on their progress.

    Thank you.

  6. #6
    Join Date
    Jan 2004
    Posts
    545
    Provided Answers: 4
    Ftmjr: I assume you have an account manager or contact for your company at Sybase. If Tech Support in Japan won't let you contact other Tech Support teams, try to work this through with your account manager. He/she might be able to mediate between you and Sybase Japan.
    I'm not crazy, I'm an aeroplane!

  7. #7
    Join Date
    Mar 2007
    Posts
    167

    Thank you for your reply...

    Sybase Japan is being very responsive now and doing their best to work through the issues. They are working diligently to find the root cause and resolve it.

    I will keep you posted on our progress.

    Thank you.

  8. #8
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443

    Stack Trace

    Do you have the stack trace from the errorlog which you can post here?

  9. #9
    Join Date
    Mar 2007
    Posts
    167

    stack trace (part one)...

    02:00000:00671:2007/09/03 15:09:06.72 kernel current process (0x5e8d03ec) infected with 11
    02:00000:00671:2007/09/03 15:09:06.81 kernel Address 0x00000000808e2fc4 (kbfalloc+0xe4), siginfo (code, address) = (1, 0x6d6d616e64207770)
    02:00000:00671:2007/09/03 15:09:06.87 kernel Spinlocks held by kpid 1586299884

    02:00000:00671:2007/09/03 15:09:06.89 kernel Spinlock Network memory pool spinlock at address 0000010000488480 owned by 1586299884
    02:00000:00671:2007/09/03 15:09:06.89 kernel End of spinlock display.
    02:00000:00671:2007/09/03 15:09:06.91 kernel pc: 0x00000000809153a4 pcstkwalk+0x28(0x000001001a426608, 0x000001001a425d80, 0x000000000000270f, 0x000000
    0000000002, 0x0000000000000000)
    02:00000:00671:2007/09/03 15:09:06.91 kernel pc: 0x000000008091528c ucstkgentrace+0x1c8(0x000000000000270f, 0x000000005e8d03ec, 0x0000000000000000, 0x0
    0000100308fae08, 0x000001002a371bd8)
    02:00000:00671:2007/09/03 15:09:06.92 kernel pc: 0x00000000808b57bc ucbacktrace+0xb0(0x0000000000000000, 0x0000000000000001, 0x00000000808ca258, 0x0000
    000000000001, 0xffffffff7dc02000)
    02:00000:00671:2007/09/03 15:09:06.93 kernel pc: 0x000000008022e1a4 terminate_process+0x680(0x000001002a371bd8, 0xffffffffffffffff, 0x0000000000000000,
    0x0000000000000000, 0x0000000000000000)
    02:00000:00671:2007/09/03 15:09:06.93 kernel pc: 0x00000000808ddc40 kisignal+0x1c4(0x000000000001ac00, 0x000000000001ac00, 0x000001001a427140, 0x000000
    0081799800, 0x000001001bbffbf0)
    02:00000:00671:2007/09/03 15:09:07.07 kernel pc: 0xffffffff7dacd0bc __sighndlr+0xc(0x000000000000000b, 0x000001001a427420, 0x000001001a427140, 0x000000
    00808dda7c, 0x0000000000000000)
    02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0xffffffff7dac14a8 call_user_handler+0x3e0(0xffffffff7dc02000, 0xffffffff7dc02000, 0x000001001a427140,
    0x0000000000000008, 0x0000000000000000)
    02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
    0000000000, 0x000001002a3727f0)
    02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x0000000080923d48 varpkt_alloc_with_retry+0x40(0x00000100004b5800, 0x000001001a427738, 0x000000008102
    0800, 0x000001002a37c580, 0x0000000000000004)
    02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x0000000080923604 usincpktsz+0x34(0x000001002a385ca8, 0x000001002a385cd8, 0x0000000000001000, 0x00000
    1001a4277f0, 0x0000000000001000)
    02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x00000000802543b0 login_varpktsz+0x100(0x0000000000001000, 0x2020202020202020, 0xffffffffffffffff, 0x
    fffffffffffffff8, 0xffffffffffffffe0)
    02:00000:00671:2007/09/03 15:09:07.09 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    02:00000:00671:2007/09/03 15:09:07.10 kernel pc: 0x000000008024ffe0 login+0x1354(0x000001002a37c248, 0x0000000000400000, 0x0000000000009c00, 0x00000000
    0000a800, 0x000001002a37c3f8)
    02:00000:00671:2007/09/03 15:09:07.10 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    02:00000:00671:2007/09/03 15:09:07.10 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    02:00000:00671:2007/09/03 15:09:07.10 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    02:00000:00671:2007/09/03 15:09:07.11 kernel pc: 0x00000000801dd0a8 conn_hdlr+0x7c0(0x00000000000076b8, 0x0000000000001400, 0x000001002a37b974, 0x00000
    00000007400, 0xffffffffffffffff)
    02:00000:00671:2007/09/03 15:09:07.11 kernel pc: 0x0000000080929fe4 _coldstart(0x00000000000000fa, 0x00000000801dc8e8, 0x0000000000000000, 0x0000000000
    000000, 0x0000000000000000)
    02:00000:00671:2007/09/03 15:09:07.11 kernel end of stack trace, spid 671, kpid 1586299884, suid 4
    02:00000:00671:2007/09/03 15:09:07.11 kernel ueshutdown: exiting
    00:00000:00084:2007/09/03 15:09:56.92 kernel timeslice -501, current process infected
    01:00000:00155:2007/09/03 15:09:56.93 kernel timeslice -501, current process infected
    00:00000:00084:2007/09/03 15:09:56.92 kernel ************************************
    01:00000:00155:2007/09/03 15:09:56.93 kernel ************************************
    00:00000:00084:2007/09/03 15:09:56.94 kernel SQL causing error :
    01:00000:00155:2007/09/03 15:09:56.94 kernel SQL causing error :
    00:00000:00084:2007/09/03 15:09:56.94 kernel ************************************
    01:00000:00155:2007/09/03 15:09:56.94 kernel ************************************
    00:00000:00084:2007/09/03 15:09:56.94 server SQL Text:
    01:00000:00155:2007/09/03 15:09:56.94 server SQL Text:
    00:00000:00084:2007/09/03 15:09:56.94 kernel curdb = 1 tempdb = 0 pstat = 0x10000
    01:00000:00155:2007/09/03 15:09:56.94 kernel curdb = 1 tempdb = 0 pstat = 0x10000
    00:00000:00084:2007/09/03 15:09:56.94 kernel lasterror = 0 preverror = 0 transtate = 1
    01:00000:00155:2007/09/03 15:09:56.94 kernel lasterror = 0 preverror = 0 transtate = 1
    00:00000:00084:2007/09/03 15:09:56.94 kernel curcmd = 0 program = Mx1_SAM015_1
    01:00000:00155:2007/09/03 15:09:56.94 kernel curcmd = 0 program = Mx1_MXSERVICE_1
    01:00000:00155:2007/09/03 15:09:56.94 kernel pc: 0x00000000809153a4 pcstkwalk+0x28(0x000001001021d318, 0x000001001021ca90, 0x000000000000270f, 0x000000
    0000000002, 0x0000000000000000)
    00:00000:00084:2007/09/03 15:09:56.94 kernel pc: 0x00000000809153a4 pcstkwalk+0x28(0x0000010012788ff8, 0x0000010012788770, 0x000000000000270f, 0x000000
    0000000002, 0x0000000000000000)
    01:00000:00155:2007/09/03 15:09:56.94 kernel pc: 0x000000008091528c ucstkgentrace+0x1c8(0x000000000000270f, 0x000000005e91022e, 0x0000000000000000, 0x0
    000010030860b18, 0x0000010027a796f8)
    01:00000:00155:2007/09/03 15:09:56.94 kernel pc: 0x00000000808b57bc ucbacktrace+0xb0(0x0000000000000000, 0x0000000000000001, 0x000000000000001e, 0x0000
    010027a8d8b4, 0x0000010000000000)
    01:00000:00155:2007/09/03 15:09:56.94 kernel pc: 0x000000008022ebdc terminate_process+0x10b8(0x0000010027a796f8, 0xffffffffffffffff, 0x000001001021d980
    , 0xffffffff7dbf2e78, 0x000001001bdcd150)
    00:00000:00084:2007/09/03 15:09:56.94 kernel pc: 0x000000008091528c ucstkgentrace+0x1c8(0x000000000000270f, 0x000000005e8f0295, 0x0000000000000000, 0x0
    0000100308844d0, 0x00000100274d6470)
    00:00000:00084:2007/09/03 15:09:56.94 kernel pc: 0x00000000808b57bc ucbacktrace+0xb0(0x0000000000000000, 0x0000000000000001, 0x000000000000001e, 0x0000
    0100274ea62c, 0x0000010000000000)
    00:00000:00084:2007/09/03 15:09:56.94 kernel pc: 0x000000008022ebdc terminate_process+0x10b8(0x00000100274d6470, 0xffffffffffffffff, 0x0000010012789660
    , 0xffffffff7dbf2e78, 0x000001001bb97b40)
    03:00000:00902:2007/09/03 15:09:56.97 kernel timeslice -501, current process infected
    03:00000:00902:2007/09/03 15:09:56.97 kernel

  10. #10
    Join Date
    Mar 2007
    Posts
    167

    stack trace (part two)...

    ************************************
    03:00000:00902:2007/09/03 15:09:56.97 kernel SQL causing error :
    03:00000:00902:2007/09/03 15:09:56.97 kernel ************************************
    03:00000:00902:2007/09/03 15:09:56.97 server SQL Text:
    03:00000:00902:2007/09/03 15:09:56.97 kernel curdb = 1 tempdb = 0 pstat = 0x10000
    03:00000:00902:2007/09/03 15:09:56.97 kernel lasterror = 0 preverror = 0 transtate = 1
    03:00000:00902:2007/09/03 15:09:56.98 kernel curcmd = 0 program = Mx1_IRDTM006_1
    03:00000:00902:2007/09/03 15:09:56.98 kernel pc: 0x00000000809153a4 pcstkwalk+0x28(0x00000100179a4c38, 0x00000100179a43b0, 0x000000000000270f, 0x000000
    0000000002, 0x0000000000000000)
    03:00000:00902:2007/09/03 15:09:56.98 kernel pc: 0x000000008091528c ucstkgentrace+0x1c8(0x000000000000270f, 0x000000005e930377, 0x0000000000000000, 0x0
    0000100308d26e0, 0x000001002b5c9160)
    03:00000:00902:2007/09/03 15:09:56.98 kernel pc: 0x00000000808b57bc ucbacktrace+0xb0(0x0000000000000000, 0x0000000000000001, 0x000000000000001e, 0x0000
    01002b5dd31c, 0x0000010000000000)
    03:00000:00902:2007/09/03 15:09:56.98 kernel pc: 0x000000008022ebdc terminate_process+0x10b8(0x000001002b5c9160, 0xffffffffffffffff, 0x00000100179a52a0
    , 0xffffffff7dbf2e78, 0x000001001bc127a0)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0xffffffff7dacd0bc __sighndlr+0xc(0x0000000000000010, 0x000001001021dc60, 0x000001001021d980, 0x000000
    00808de0a8, 0x0000000000000000)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0xffffffff7dac14a8 call_user_handler+0x3e0(0xffffffff7dc02000, 0xffffffff7dc02000, 0x000001001021d980,
    0x0000000000000008, 0x0000000000000000)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
    0000000000, 0x0000010027a7a310)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x0000000080923d48 varpkt_alloc_with_retry+0x40(0x00000100004b5800, 0x000001001021df78, 0x000000008102
    0800, 0x0000010027a840a0, 0x0000000000000004)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x00000000809235f0 usincpktsz+0x20(0x0000010027a8d7c8, 0x0000010027a8d7f8, 0x0000000000001000, 0x00000
    1001021e030, 0x0000000000001000)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x00000000802543b0 login_varpktsz+0x100(0x0000000000001000, 0x2020202020202020, 0xffffffffffffffff, 0x
    fffffffffffffff8, 0xffffffffffffffe0)
    01:00000:00155:2007/09/03 15:09:56.98 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x000000008024ffe0 login+0x1354(0x0000010027a83d68, 0x0000000000400000, 0x0000000000009c00, 0x00000000
    0000a800, 0x0000010027a83f18)
    01:00000:00155:2007/09/03 15:09:56.98 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    01:00000:00155:2007/09/03 15:09:56.98 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    01:00000:00155:2007/09/03 15:09:56.98 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x00000000801dd0a8 conn_hdlr+0x7c0(0x00000000000076b8, 0x0000000000001400, 0x0000010027a83494, 0x00000
    00000007400, 0xffffffffffffffff)
    01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x0000000080929fe4 _coldstart(0x000000000000011c, 0x00000000801dc8e8, 0x0000000000000000, 0x0000000000
    000000, 0x0000000000000000)
    01:00000:00155:2007/09/03 15:09:56.98 kernel end of stack trace, spid 155, kpid 1586561582, suid 4
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0xffffffff7dacd0bc __sighndlr+0xc(0x0000000000000010, 0x0000010012789940, 0x0000010012789660, 0x000000
    00808de0a8, 0x0000000000000000)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0xffffffff7dac14a8 call_user_handler+0x3e0(0xffffffff7dc02000, 0xffffffff7dc02000, 0x0000010012789660,
    0x0000000000000008, 0x0000000000000000)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
    0000000000, 0x00000100274d7088)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x0000000080923d48 varpkt_alloc_with_retry+0x40(0x00000100004b5800, 0x0000010012789c58, 0x000000008102
    0800, 0x00000100274e0e18, 0x0000000000000004)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x00000000809235f0 usincpktsz+0x20(0x00000100274ea540, 0x00000100274ea570, 0x0000000000001000, 0x00000
    10012789d10, 0x0000000000001000)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x00000000802543b0 login_varpktsz+0x100(0x0000000000001000, 0x2020202020202020, 0xffffffffffffffff, 0x
    fffffffffffffff8, 0xffffffffffffffe0)
    00:00000:00084:2007/09/03 15:09:57.01 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x000000008024ffe0 login+0x1354(0x00000100274e0ae0, 0x0000000000400000, 0x0000000000009c00, 0x00000000
    0000a800, 0x00000100274e0c90)
    00:00000:00084:2007/09/03 15:09:57.01 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    00:00000:00084:2007/09/03 15:09:57.01 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    00:00000:00084:2007/09/03 15:09:57.01 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x00000000801dd0a8 conn_hdlr+0x7c0(0x00000000000076b8, 0x0000000000001400, 0x00000100274e020c, 0x00000
    00000007400, 0xffffffffffffffff)
    00:00000:00084:2007/09/03 15:09:57.01 kernel pc: 0x0000000080929fe4 _coldstart(0x000000000000010f, 0x00000000801dc8e8, 0x0000000000000000, 0x0000000000
    000000, 0x0000000000000000)
    00:00000:00084:2007/09/03 15:09:57.01 kernel end of stack trace, spid 84, kpid 1586430613, suid 4
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0xffffffff7dacd0bc __sighndlr+0xc(0x0000000000000010, 0x00000100179a5580, 0x00000100179a52a0, 0x000000
    00808de0a8, 0x0000000000000000)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0xffffffff7dac14a8 call_user_handler+0x3e0(0xffffffff7dc02000, 0xffffffff7dc02000, 0x00000100179a52a0,
    0x0000000000000008, 0x0000000000000000)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
    0000000000, 0x000001002b5c9d78)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x0000000080923d48 varpkt_alloc_with_retry+0x40(0x00000100004b5800, 0x00000100179a5898, 0x000000008102
    0800, 0x000001002b5d3b08, 0x0000000000000004)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x00000000809235f0 usincpktsz+0x20(0x000001002b5dd230, 0x000001002b5dd260, 0x0000000000001000, 0x00000
    100179a5950, 0x0000000000001000)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x00000000802543b0 login_varpktsz+0x100(0x0000000000001000, 0x2020202020202020, 0xffffffffffffffff, 0x
    fffffffffffffff8, 0xffffffffffffffe0)
    03:00000:00902:2007/09/03 15:09:57.03 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x000000008024ffe0 login+0x1354(0x000001002b5d37d0, 0x0000000000400000, 0x0000000000009c00, 0x00000000
    0000a800, 0x000001002b5d3980)
    03:00000:00902:2007/09/03 15:09:57.03 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
    03:00000:00902:2007/09/03 15:09:57.03 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    03:00000:00902:2007/09/03 15:09:57.03 kernel [Handler pc: 0x00000000804fd5a8 ut_handle installed by the following function:-]
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x00000000801dd0a8 conn_hdlr+0x7c0(0x00000000000076b8, 0x0000000000001400, 0x000001002b5d2efc, 0x00000
    00000007400, 0xffffffffffffffff)
    03:00000:00902:2007/09/03 15:09:57.03 kernel pc: 0x0000000080929fe4 _coldstart(0x0000000000000113, 0x00000000801dc8e8, 0x0000000000000000, 0x0000000000
    000000, 0x0000000000000000)
    03:00000:00902:2007/09/03 15:09:57.03 kernel end of stack trace, spid 902, kpid 1586692983, suid 4
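
    Because the engines write to the errorlog concurrently, the frames for spids 84, 155, 671 and 902 are interleaved above. A small script can group the frame names by spid so the call chains are easier to compare. This is only a sketch: the regular expression is an assumption based on the line layout in this particular errorlog, and wrapped continuation lines are simply ignored.

```python
import re
from collections import defaultdict

# engine:family:spid:date time kernel pc: 0x<address> <function>+0x<offset>(
# Layout inferred from the errorlog lines in this thread (an assumption).
FRAME = re.compile(
    r"^\d{2}:\d{5}:(\d{5}):\S+ \S+ kernel pc: 0x[0-9a-f]+ (\w+)")


def frames_by_spid(lines):
    """Group stack frame function names by spid, in errorlog order."""
    chains = defaultdict(list)
    for line in lines:
        match = FRAME.match(line.strip())
        if match:
            chains[int(match.group(1))].append(match.group(2))
    return dict(chains)


# A few lines copied from the traces above, including wrapped continuations.
SAMPLE = r"""
02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
0000000000, 0x000001002a3727f0)
02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x0000000080923d48 varpkt_alloc_with_retry+0x40(0x00000100004b5800, 0x000001001a427738, 0x000000008102
0800, 0x000001002a37c580, 0x0000000000000004)
02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x0000000080923604 usincpktsz+0x34(0x000001002a385ca8, 0x000001002a385cd8, 0x0000000000001000, 0x00000
1001a4277f0, 0x0000000000001000)
02:00000:00671:2007/09/03 15:09:07.08 kernel pc: 0x00000000802543b0 login_varpktsz+0x100(0x0000000000001000, 0x2020202020202020, 0xffffffffffffffff, 0x
fffffffffffffff8, 0xffffffffffffffe0)
02:00000:00671:2007/09/03 15:09:07.09 kernel [Handler pc: 0x000000008028cb0c hdl_backout installed by the following function:-]
01:00000:00155:2007/09/03 15:09:56.98 kernel pc: 0x00000000808e309c kbfalloc+0x1bc(0x00000100004b5800, 0x0000000000001000, 0x0000000000000000, 0x000000
0000000000, 0x0000010027a7a310)
"""

for spid, chain in frames_by_spid(SAMPLE.splitlines()).items():
    print(spid, " <- ".join(chain))
```

    Run against the full errorlog, this should make it easier to see that each affected spid was inside login_varpktsz -> usincpktsz -> varpkt_alloc_with_retry -> kbfalloc.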

  11. #11
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    There's a similar case with 12.5.3, but you are saying your version is 12.5.4. From what I can see, it looks like a bug that may have resurfaced.

    CR # 370870 has been fixed and will be included in ASE 12.5.3/esd#6

    So, until Sybase finds a patch for this, I would suggest working with them and providing them whatever info they want.

    Did they give you any new workarounds to fix the issue since you said they withdrew the old ones? Try the new workarounds they give you and see if any of them fixes the issue for the time being, as you don't want a prod server crashing while Sybase is trying to get to the root cause.

    This is not much help, but try to find an interim solution. At times, you may even have to sacrifice performance to get a stable environment.

  12. #12
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    Did you ever find the root cause for this?

  13. #13
    Join Date
    Mar 2007
    Posts
    167

    Unfortunately, there are no new workarounds or news updates about fixes. :(

    Unfortunately, there are no new workarounds or new updates about fixes.

    This issue just got stuck in the mud. No progress has been made since, other than setting up a CSMD (Shared Memory Dump) just in case it happens again in the future.

    Unfortunately, they have not been able to reproduce it in a Reproduction Environment, meaning they have not been able to prove their theories.

    As you mentioned in your prior posting, we are using ASE 12.5.4 and no similar case has been found to date that matches.

    I guess we can be happy that we have not faced the outage since. Keep in mind that we did not implement any of the old proposed workarounds; the only change we made was setting up the memory dump.

    We'll keep you posted if anything changes.

  14. #14
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    Thanks ftmjr. That is unfortunate.

    We had a similar issue in 12.5.1 this week, but not the kbfalloc function. It was ubffree. Might have to raise a case with them.

    04:00000:00890:2008/01/08 04:11:33.34 kernel Spinlocks held by kpid 1249706609



    04:00000:00890:2008/01/08 04:11:33.34 kernel Spinlock Network memory pool spinlock at address 0000000200a0bf40 owned by 1249706609

    04:00000:00890:2008/01/08 04:11:33.34 kernel End of spinlock display.

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0x000000008083eea4 pcstkwalk+0x28(0x000000020650f058, 0x000000020650e7d0, 0x000000000000270f, 0x0000000000000002, 0xfffffffffffffff8)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0x000000008083ed8c ucstkgentrace+0x1c8(0x000000000000270f, 0x000000004a7d0271, 0x0000000000000000, 0x00000002184aebf0, 0x00000002154122f8)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0x00000000807f1a30 ucbacktrace+0xb0(0x0000000000000000, 0x0000000000000001, 0x0000000080802d0c, 0x0000000000000001, 0x00000000000000d8)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0x00000000802191b8 terminate_process+0x5e4(0x0000000000000000, 0xffffffffffffffff, 0x0000000000000000, 0x000000020650f450, 0x0000000000000001)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0x000000008080bd34 kisignal+0x16c(0x000000000000000b, 0x000000020650fe80, 0x000000020650fba0, 0x0000000000000000, 0x0000000000000000)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0xffffffff7f01768c _setitimer+0xb4(0x000000000000000b, 0x000000020650fe80, 0x000000020650fba0, 0x000000008080bbc8, 0x0000000000000000)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0xffffffff7f0111d4 sema_post+0x530(0x000000000000000b, 0xffffffff7da00000, 0x0000000000000001, 0xffffffff7f11cba0, 0xffffffff7f11ca40)

    04:00000:00890:2008/01/08 04:11:33.34 kernel pc: 0xffffffff7f0113c0 sema_post+0x71c(0xffffffff7da00000, 0x000000020650fba0, 0xffffffff7f11a000, 0x000000020650fe80, 0x000000020650fe80)

    04:00000:00890:2008/01/08 04:11:33.35 kernel pc: 0x0000000080811cb4 ubffree+0x28c(0x0000000200a23000, 0x0000000201688150, 0xffffffffffffffff, 0x000000021541cbf0, 0x0000000000000004)

    04:00000:00890:2008/01/08 04:11:33.35 kernel pc: 0x000000008084c610 usincpktsz+0x148(0x0000000200ae1710, 0x0000000000001400, 0x0000000000001400, 0x00000002065101a0, 0x0000000000001400)

    04:00000:00890:2008/01/08 04:11:33.35 kernel pc: 0x000000008023da08 login_varpktsz+0xf8(0x0000000215426218, 0x0000000020202020, 0xffffffffffffffff, 0xfffffffffffffff8, 0x0000000000000000)

  15. #15
    Join Date
    Mar 2007
    Posts
    167

    CSMD (shared memory dump)

    I apologize if you already know how to do this or have information on how to do this. I thought I would pass it on just in case and hope that it is helpful. The below URL describes how to set up a CSMD (shared memory dump).

    http://infocenter.sybase.com/help/in.../svrtsg178.htm

    At least this will be the first thing Sybase will ask you to do when you open a case with them. This won't prevent the issue from happening again, but may be helpful in troubleshooting if it does.

    Hope this helps.
