[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2365: cloud backend: download from S4 hangs on a particular file

Tahoe-LAFS trac at tahoe-lafs.org
Tue Jan 20 22:41:45 UTC 2015


#2365: cloud backend: download from S4 hangs on a particular file
-------------------------+-------------------------------------------------
     Reporter:  daira    |      Owner:
         Type:  defect   |     Status:  new
     Priority:  normal   |  Milestone:  undecided
    Component:  unknown  |    Version:  cloud-branch
   Resolution:           |   Keywords:  s4 cloud-backend download
Launchpad Bug:           |  tahoe-backup
-------------------------+-------------------------------------------------
Description changed by daira:

New description:

 Rafal Jedruszek reported this problem with S4.

 I cannot download one of my files; nothing happens when I try to download
 it. The status screen shows 0.0% progress and "fetching segment 0".

 Logs are:
 {{{
 starting..
 Connecting..
 Connected (to pid 915)
 Remote Versions:
 Nevow: 0.11.1
 Twisted: 14.0.2
 allmydata-tahoe: 1.10.0
 foolscap: 0.6.5
 mock: 1.0.1
 platform: Linux-debian_8.0-x86_64-64bit_ELF
 pyOpenSSL: 0.14
 pyasn1: 0.1.7
 pycrypto: 2.6.1
 pycryptopp: 0.6.0.1206569328141510525648634803928199668821045408958
 python: 2.7.8
 setuptools: 5.5.1
 simplejson: 3.6.5
 twisted: 14.0.2
 zfec: 1.4.5
 zope.interface: unknown
 23:43:38.715 L20 []#1704 Immutable.DownloadNode(vukxrvahqlwk) created:
 size=4400830, guessed_segsize=131072, guessed_numsegs=34
 23:43:38.715 L10 []#1705 ShareFinder[si=vukxrvahqlwk] starting
 23:43:38.715 L20 []#1706 imm Node(vukxrvah).read(0, 4400830)
 23:43:38.715 L10 []#1707 _fetch_next(offset=0) probably wants segnum=0
 23:43:38.715 L20 []#1708 imm Node(vukxrvah).get_segment(0)
 23:43:38.715 L10 []#1709
 ImmutableDownloadNode(vukxrvahqlwk)._start_new_segment: segnum=0
 23:43:38.728 L10 []#1710 ShareFinder[si=vukxrvahqlwk] hungry
 23:43:38.731 L10 []#1711 ShareFinder loop: running=True hungry=True,
 pending=
 23:43:38.731 L10 []#1712 sending DYHB to [e7ph56p4]
 23:43:38.736 L10 []#1713 ShareFinder loop: running=True hungry=True,
 pending=e7ph56p4
 23:43:48.732 L10 []#1714 ShareFinder loop: running=True hungry=True,
 pending=e7ph56p4
 }}}
 And it stays this way.

 When I try to upload the file, nothing happens either. The web page shows
 "hash: 100.0%, ciphertext: 0.0%, encode: 0.0%" and "Status: Contacting
 Servers [e7ph56p4] (first query), 0 shares left..".

 Logs:
 {{{
 23:50:06.576 L20 []#2917 SharemapUpdater(ootig): starting (MODE_READ)
 23:50:06.576 L10 []#2918 sending query to [e7ph56p4], readsize=4000
 23:50:06.867 L20 []#2919 got result from [e7ph56p4], 1 shares
 23:50:06.867 L10 []#2920 _got_results done
 23:50:06.871 L10 []#2921 _got_results: got shnum #0 from serverid e7ph56p4
 23:50:06.871 L20 []#2922 found valid version 6-uif6 from e7ph56p4-sh0:
 1-1/325/325
 23:50:06.871 L10 []#2923 _check_for_done, mode is 'MODE_READ', 0 queries
 outstanding, 0 extra servers available, 0 'must query' servers left,
 need_privkey=False
 23:50:06.871 L20 []#2924 all queries are retired, no extra servers: done
 23:50:06.871 L20 []#2925 servermap: 1*seq6-uif6
 23:50:06.871 L10 []#2926 _check_for_done, mode is 'MODE_READ', 0 queries
 outstanding, 0 extra servers available, 0 'must query' servers left,
 need_privkey=False
 23:50:06.871 L10 []#2927 but we're not running
 23:50:06.884 L20 []#2928 Retrieve(ootig): starting
 23:50:06.884 L20 []#2929 got seqnum 6
 23:50:06.885 L20 []#2930 got encoding parameters: k: 1 n: 1 1 segments of
 325 bytes each (325 byte tail segment)
 23:50:06.885 L20 []#2931 starting download
 23:50:06.891 L20 []#2932 adding 1 new servers to the active list
 23:50:06.891 L20 []#2933 added reader for share 0
 23:50:06.891 L20 []#2934 on segment 1 of 1
 23:50:06.891 L20 []#2935 processing segment 0
 23:50:06.891 L20 []#2936 getting blockhashes for segment 0, share 0:
 set([0])
 23:50:06.891 L20 []#2937 validating share 0 for segment 0
 23:50:06.891 L20 []#2938 the reader gave me the following blockhashes: [0]
 23:50:06.891 L20 []#2939 the reader gave me the following sharehashes: []
 23:50:06.891 L20 []#2940 share 0 is valid for segment 0
 23:50:06.891 L20 []#2941 trying to decode and decrypt segment 0
 23:50:06.891 L20 []#2942 everything looks ok, building segment 0
 23:50:06.891 L20 []#2943 decoding segment 0
 23:50:06.892 L10 []#2944 now decoding segment 0 of 1
 23:50:06.892 L20 []#2945 joined length 325, datalength 325
 23:50:06.892 L20 []#2946 segment len=325
 23:50:06.892 L20 []#2947 decrypting segment 0
 23:50:06.892 L20 []#2948 got plaintext for segment 0
 23:50:06.892 L20 []#2949 stripping 0 bytes off of the first segment
 23:50:06.892 L20 []#2950 original segment length: 325
 23:50:06.892 L20 []#2951 new segment length: 325
 23:50:06.892 L20 []#2952 adding 0 new servers to the active list
 23:50:06.892 L20 []#2953 got plaintext, done
 23:50:06.893 L20 []#2954 web: 192.168.1.32 GET /uri/[CENSORED]..?t=json
 404 22
 23:50:07.038 L20 []#2955 SharemapUpdater(ootig): starting (MODE_READ)
 23:50:07.039 L10 []#2956 sending query to [e7ph56p4], readsize=4000
 23:50:07.363 L20 []#2957 got result from [e7ph56p4], 1 shares
 23:50:07.364 L10 []#2958 _got_results done
 23:50:07.367 L10 []#2959 _got_results: got shnum #0 from serverid e7ph56p4
 23:50:07.367 L20 []#2960 found valid version 6-uif6 from e7ph56p4-sh0:
 1-1/325/325
 23:50:07.367 L10 []#2961 _check_for_done, mode is 'MODE_READ', 0 queries
 outstanding, 0 extra servers available, 0 'must query' servers left,
 need_privkey=False
 23:50:07.367 L20 []#2962 all queries are retired, no extra servers: done
 23:50:07.368 L20 []#2963 servermap: 1*seq6-uif6
 23:50:07.368 L10 []#2964 _check_for_done, mode is 'MODE_READ', 0 queries
 outstanding, 0 extra servers available, 0 'must query' servers left,
 need_privkey=False
 23:50:07.368 L10 []#2965 but we're not running
 23:50:07.383 L20 []#2966 Retrieve(ootig): starting
 23:50:07.383 L20 []#2967 got seqnum 6
 23:50:07.383 L20 []#2968 got encoding parameters: k: 1 n: 1 1 segments of
 325 bytes each (325 byte tail segment)
 23:50:07.383 L20 []#2969 starting download
 23:50:07.390 L20 []#2970 adding 1 new servers to the active list
 23:50:07.390 L20 []#2971 added reader for share 0
 23:50:07.390 L20 []#2972 on segment 1 of 1
 23:50:07.390 L20 []#2973 processing segment 0
 23:50:07.390 L20 []#2974 getting blockhashes for segment 0, share 0:
 set([0])
 23:50:07.390 L20 []#2975 validating share 0 for segment 0
 23:50:07.390 L20 []#2976 the reader gave me the following blockhashes: [0]
 23:50:07.390 L20 []#2977 the reader gave me the following sharehashes: []
 23:50:07.390 L20 []#2978 share 0 is valid for segment 0
 23:50:07.390 L20 []#2979 trying to decode and decrypt segment 0
 23:50:07.390 L20 []#2980 everything looks ok, building segment 0
 23:50:07.390 L20 []#2981 decoding segment 0
 23:50:07.390 L10 []#2982 now decoding segment 0 of 1
 23:50:07.390 L20 []#2983 joined length 325, datalength 325
 23:50:07.390 L20 []#2984 segment len=325
 23:50:07.390 L20 []#2985 decrypting segment 0
 23:50:07.390 L20 []#2986 got plaintext for segment 0
 23:50:07.390 L20 []#2987 stripping 0 bytes off of the first segment
 23:50:07.390 L20 []#2988 original segment length: 325
 23:50:07.390 L20 []#2989 new segment length: 325
 23:50:07.390 L20 []#2990 adding 0 new servers to the active list
 23:50:07.390 L20 []#2991 got plaintext, done
 23:50:07.391 L20 []#2992 CHKUploader starting
 23:50:07.391 L20 []#2993 starting upload of
 <allmydata.immutable.upload.EncryptAnUploadable instance at
 0x7f72609fb878>
 23:50:07.392 L20 []#2994 creating Encoder <Encoder for unknown storage
 index>
 23:50:07.392 L20 []#2995 file size: 4400830
 23:50:07.392 L10 []#2996 my encoding parameters: (1, 1, 1, 131072)
 23:50:07.392 L20 []#2997 got encoding parameters: 1/1/1 131072
 23:50:07.392 L20 []#2998 now setting up codec
 23:50:07.416 L20 []#2999 using storage index vukxr
 23:50:07.416 L20 []#3000 <Tahoe2ServerSelector for upload vukxr>(vukxr):
 starting
 23:51:46.766 L20 []#3001 None
 23:51:46.767 L20 []#3002 web: 192.168.1.9 GET /status/up-1 200 732
 }}}
 (The last two lines must be from me fetching the status web page.)

 I tried with another file and both download and upload work. I noticed the
 problem during a backup - tahoe was not able to complete the backup. I
 identified that this one file is the problematic one; tahoe-lafs gets stuck
 on it.

--

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2365#comment:1>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

