use case: sharing media content between orgs
Gulyás Attila
toraritte at gmail.com
Wed Dec 2 19:56:53 UTC 2020
Hi Jean-Paul,
I truly appreciate your detailed answer to my underspecified (to say
the least) question and you raised excellent points!
> At a very high level, this sounds plausible. It is true that different
> organizations can collaborate to form a Tahoe-LAFS storage "grid".
Thanks for the confirmation!
> We talk about the "grid" a lot but you really have to apply a lot of
> abstractions to create this concept with Tahoe-LAFS right now (though
> there's some work underway to turn it into a more concrete thing).
I believe I understand what you mean here but
would you be able to provide more info about the
work currently underway? (Trac ticket, blog post,
anything, just to make sure I got it right.)
> A Tahoe-LAFS "grid" is just one or more storage servers being used
> together and you have a lot of flexibility around where those storage
> servers come from.
Yes, this is the main thing that grabbed my
attention.
> At the content level, how would different organizations offer content to
> save other organizations labor?
> One area that you probably want to think about more is how those
> organizations coordinate with each other in their use of these servers
> and the creation and consumption of content on them.
The original idea was that organizations wouldn't
use the "grid" directly; a service would be built on
top of it via the Tahoe-LAFS REST API that would
also have public APIs for other organizations'
application(s) to consume. Tahoe-LAFS would be
a "flat" file/object storage layer (holding the
metadata and hierarchical structure as well?) and
the service would also serve as access control.
Using Tahoe-LAFS this way would of course lose all
its distributed properties because of the central
service on top, but it would provide a cloud
vendor-agnostic layer.
An example would be that Org_A's volunteers record
the Safeway ads and upload the recording to this
central service; other organizations then check
whether a current Safeway flyer has already been
recorded for a given week, and if it has, they
don't have to assign it to their volunteers.
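To make the example a bit more concrete, here is a
rough Python sketch of the "is this week's flyer
already there?" check. It assumes a local Tahoe-LAFS
client node with its web API at the default
http://127.0.0.1:3456 and the JSON directory listing
returned for `?t=json`; the file naming convention
and the function names are made up for illustration.

```python
import json
from urllib.parse import quote

# Assumption: a local Tahoe-LAFS client node serving its web API here.
NODE_URL = "http://127.0.0.1:3456"


def flyer_name(store, iso_year, iso_week):
    """Hypothetical naming convention, e.g. 'safeway-2020-W49.mp3'."""
    return f"{store.lower()}-{iso_year}-W{iso_week:02d}.mp3"


def listing_url(dircap):
    """URL whose GET response is the JSON listing of a directory cap."""
    return f"{NODE_URL}/uri/{quote(dircap, safe='')}?t=json"


def has_flyer(listing_json, store, iso_year, iso_week):
    """True if the directory listing already contains this week's recording."""
    node_type, info = json.loads(listing_json)
    if node_type != "dirnode":
        return False
    return flyer_name(store, iso_year, iso_week) in info.get("children", {})
```

The central service would fetch `listing_url(...)`
for the shared directory and pass the response body
to `has_flyer` before assigning the flyer to a
volunteer.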
All this may be a very naive idea, and the hope of
other organizations being willing to participate in
such an endeavour may be a pipe dream anyway. (The
governance of such a multi-organizational
collaboration would also be tricky, but that is
off-topic here.) The
main reason I looked at Tahoe-LAFS at first was that
when we were working on overhauling our 20+ year old
system, we started on Google Cloud Engine but had to
migrate to Azure; migrating VMs was relatively easy
but each vendor has vastly different cloud storage
APIs, limitations, specs, etc. and it would be a
pain to go through this one more time.
> For example, would it make more sense to have a fully open grid or a
> private grid open only to participating organizations?
Thank you for raising this question because it
hadn't even occurred to me. Audio information
services provide a lot of public domain content
(e.g., old time radio shows, store ad flyers,
newsletters) but they are also able to air
copyrighted materials pursuant to 17 U.S.C. § 121
(https://uscode.house.gov/view.xhtml?req=granuleid:USC-prelim-title17-section121&num=0&edition=prelim).
Ideally, the former would be made available
to anyone and disseminated as freely as possible
whereas only print-disabled individuals should be
able to access the latter.
This is the part where I get confused: I guess it
is possible to build the above-mentioned central
service to reflect these requirements, but would
there have to be 2 separate "grids"? I mean,
volunteers could also pitch in storage nodes, but
that would also mean that they would have access to
all the data as well, correct?
> Perhaps each organization maintains a directory (or directory hierarchy)
> which only it can write to but which all other organizations can read
> from. This might be done on an ad hoc basis or with a tool like
> magic-folder (which is currently very much a work in progress). Then
> organizations would browse the read-only directories shared with them by
> other organizations to see if the desired content already exists. If
> found, they can retrieve and use it. If not, they can create and upload
> it.
Even though I mentioned earlier that this wouldn't
be a direct access "grid", I still appreciate
this description because (1) the central service
notion may not be viable at all and (2) this
straightforward example made me understand some of
the concepts I struggled with when reading the
manual.
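If I understand the description correctly, the
"browse the other organizations' read-only
directories" step could be sketched roughly as
below. The input is assumed to be the already
parsed JSON listings (the `?t=json` form of the web
API) of each organization's read-only directory;
the organization names and the function name are
hypothetical.

```python
def find_recording(listings, filename):
    """Return (org, read cap) for the first org that has `filename`, else None.

    `listings` maps an organization name to the parsed JSON listing of the
    read-only directory that organization shares with everyone else.
    """
    for org, (node_type, info) in listings.items():
        if node_type != "dirnode":
            continue
        child = info.get("children", {}).get(filename)
        if child is not None:
            _child_type, child_info = child
            # "ro_uri" is the read-only capability of the found file.
            return org, child_info.get("ro_uri")
    return None
```

If the result is None, nobody has recorded the
content yet, so the organization would create it
and upload it into its own writable directory.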
> This seems workable - however, I wonder if there is an advantage to
> using Tahoe-LAFS over another system. For example, Google Drive and
> Dropbox would offer comparable experiences, I think, without the need to
> operate storage servers.
Oh, do you mean that Tahoe-LAFS can be used on top
of Google Drive, OneDrive, Dropbox, and their ilk?
If so, it would make things way simpler (and I'm
not sure how I missed this...).
> Or, to avoid proprietary, centralized systems, NextCloud has file
> storage and sharing capabilities. It is not a distributed system but
> it's easy to find a commercial offering that could be shared across
> organizations. This comes at a cost - but so does operating Tahoe-LAFS
> storage servers, and I suspect NextCloud hosting is price competitive
> (unless volunteer labor can be discounted, perhaps).
I just looked up NextCloud and will have to look
into it some more, but I really like how anyone
could just chip in to a Tahoe-LAFS based system.
Then again, if an organization quits/goes out of
business/etc., one would have to figure out how the
lost nodes would affect the "grid" (if the "grid"
is too small, that is, right?). I may also be
oversimplifying here (hello Dunning-Kruger effect).
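My current understanding of the "lost nodes"
question, as a back-of-the-envelope sketch: a file
is erasure-coded into shares.total shares, any
shares.needed of which are enough to rebuild it
(3-of-10 by default). The code below additionally
assumes that every share sits on a different
server, which a small "grid" may not satisfy; the
function names are mine.

```python
def max_tolerable_losses(shares_needed, shares_total):
    """Servers that can disappear before a file becomes unrecoverable,
    assuming exactly one share per server."""
    return shares_total - shares_needed


def still_recoverable(shares_needed, servers_holding_shares, servers_lost):
    """True if enough share-holding servers remain to rebuild the file."""
    remaining = servers_holding_shares - servers_lost
    return remaining >= shares_needed
```

With the default 3-of-10 encoding and ten servers,
`still_recoverable(3, 10, 7)` is True but
`still_recoverable(3, 10, 8)` is False, so the
smaller the "grid", the more each departing
organization matters.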
> Or maybe it is the case that the loose group of organizations actually
> benefit significantly from the distributed nature of Tahoe-LAFS -
> perhaps because the operation of the software more closely matches the
> relationships of the organizations to each other?
This is so spot on that I'm mad at myself for not
having been able to put it into words.
Thanks again!
Attila
On Tue, Dec 1, 2020 at 2:13 PM Jean-Paul Calderone
<jean-paul+tahoe-dev at leastauthority.com> wrote:
>
> On Fri, Nov 27, 2020 at 3:45 AM Gulyás Attila <toraritte at gmail.com> wrote:
>>
>> Hi,
>>
>> Would it be a valid use case for Tahoe-LAFS to share media content
>> among organizations?
>>
>> To elaborate, there are many reading services for the blind all across
>> the States (https://en.wikipedia.org/wiki/Radio_reading_service) and
>> volunteer effort is duplicated for certain topics (e.g., grocery store
>> flyers for large chains are the same for every state, yet services
>> have their volunteers read these every week, independently of each
>> other). These services all use different storage solutions (i.e.,
>> on-site servers and different cloud vendors) and based on what I read
>> so far Tahoe-LAFS could help to bridge this gap: organizations that
>> choose to join would be able to contribute storage space to the grid
>> and everyone would have uniform access to the content.
>>
>> Am I missing something? Thanks in advance!
>
>
> Hi Gulyás,
>
> At a very high level, this sounds plausible. It is true that different organizations can collaborate to form a Tahoe-LAFS storage "grid". We talk about the "grid" a lot but you really have to apply a lot of abstractions to create this concept with Tahoe-LAFS right now (though there's some work underway to turn it into a more concrete thing).
>
> A Tahoe-LAFS "grid" is just one or more storage servers being used together and you have a lot of flexibility around where those storage servers come from.
>
> It's true that each volunteer/service organization could operate one or more storage servers and that a Tahoe-LAFS storage client could be configured to use any and all of these servers for storage. One area that you probably want to think about more is how those organizations coordinate with each other in their use of these servers and the creation and consumption of content on them.
>
> For example, would it make more sense to have a fully open grid or a private grid open only to participating organizations? A fully open grid can accept contributions of resources from more participants but it also gives out storage access to more participants as well. A private grid might make more sense but comes with operational security requirements to ensure it remains private.
>
> At the content level, how would different organizations offer content to save other organizations labor? Perhaps each organization maintains a directory (or directory hierarchy) which only it can write to but which all other organizations can read from. This might be done on an ad hoc basis or with a tool like magic-folder (which is currently very much a work in progress). Then organizations would browse the read-only directories shared with them by other organizations to see if the desired content already exists. If found, they can retrieve and use it. If not, they can create and upload it.
>
> This seems workable - however, I wonder if there is an advantage to using Tahoe-LAFS over another system. For example, Google Drive and Dropbox would offer comparable experiences, I think, without the need to operate storage servers. Or, to avoid proprietary, centralized systems, NextCloud has file storage and sharing capabilities. It is not a distributed system but it's easy to find a commercial offering that could be shared across organizations. This comes at a cost - but so does operating Tahoe-LAFS storage servers, and I suspect NextCloud hosting is price competitive (unless volunteer labor can be discounted, perhaps).
>
> Or maybe it is the case that the loose group of organizations actually benefit significantly from the distributed nature of Tahoe-LAFS - perhaps because the operation of the software more closely matches the relationships of the organizations to each other?
>
> Does this make sense? Does it help? I'm happy to consider any follow-up questions.
>
> Jean-Paul
>
>
>>
>>
>> Appreciatively,
>> Attila Gulyas | IT/Program Assistant
>> Email: agulyas at societyfortheblind.org
>> Phone: (916) 889-7510
>>
>> Access News helpdesk:
>> (916) 889-7519
>> accessnews at societyfortheblind.org
>>
>> Access News system:
>> (800) 665-4667
>> (916) 732-4000
>>
>> Society for the Blind
>> 1238 S Street
>> Sacramento, CA 95811
>>
>> SFTB Main Phone: (916) 452-8271
>> Fax: (916) 492-2483
>> www.societyfortheblind.org
>>
>> Our mission is to empower individuals living with low vision or
>> blindness to discover, develop and achieve their full potential.