User talk:Kat/Western Africa
Discussion space for the preparation of the Western Africa FCT for integration with WikiFCD
Intro thoughts (no reply needed)
Hi Kat + Mika, Thanks again for letting me join the WikiFCD effort. It's fun to be involved, and (as expected) there's no quicker way to learn about the technology than to contribute to it.

I'm making decent progress on my initial task (adding WikiFCD QIDs to the W.Africa FCT), but it's slow going because I've resorted to doing it entirely by hand. I spent some time trying to figure out a more automated way to do it, but in the end the manual method looked like the quickest option with the tools/skills I had available. The faster approach does seem to be a simple(?) query script: feed in all the unique Scientific Names as a search array, and return the matching QID (if one exists) for each of them. I just wasn't clever enough to know how to write that kind of query.

As I've been populating the spreadsheet by hand, I've noticed (so far) 3 typical dilemmas that pop up. I'll explain each of them in a separate post, to enable separate responses. Kat, hope you had a great vacation!
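For reference, a minimal sketch of the lookup described above. It assumes a label-to-QID mapping has already been exported from WikiFCD (for example from its query service) into a plain dictionary; how that export is produced is not covered here. The function names (`normalize`, `match_qids`) are hypothetical, and the QIDs and names in the sample data are invented placeholders, not real WikiFCD identifiers.

```python
import unicodedata


def normalize(name: str) -> str:
    """Normalize Unicode form, ignore case, and collapse repeated whitespace."""
    return " ".join(unicodedata.normalize("NFKC", name).casefold().split())


def match_qids(scientific_names, label_to_qid):
    """Return {name: QID or None} for each unique, non-empty scientific name.

    label_to_qid is ASSUMED to be an export of WikiFCD item labels to QIDs.
    Only exact matches (after normalization) are returned; anything else
    maps to None so it can be reviewed by hand.
    """
    index = {normalize(label): qid for label, qid in label_to_qid.items()}
    result = {}
    for name in scientific_names:
        if not name or not name.strip():
            continue  # skip blank spreadsheet cells
        # setdefault keeps the first occurrence, so duplicates are ignored
        result.setdefault(name, index.get(normalize(name)))
    return result


# Placeholder data: these QIDs are made up for illustration only.
wikifcd_labels = {"Arachis hypogaea": "Q101", "Amaranthus": "Q102"}
fct_names = ["arachis  hypogaea", "Amaranthus", "Vigna subterranea", "Amaranthus"]

matches = match_qids(fct_names, wikifcd_labels)
for name, qid in matches.items():
    print(f"{name!r}: {qid or 'no match'}")
```

Exact matching is deliberately conservative: a name with no match comes back as `None` rather than being guessed at, which leaves the judgment calls (genus vs. species, processed foods) to a manual pass.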
1. Locating right level in taxonomy
Sometimes it can be tricky to know whether an FCT entry should be mapped against a genus, a species, or a specific varietal in WikiFCD. Most often, the FCT is not specific ("amaranth"), and WikiFCD has existing entries for the genus and many species (and often multiple varietals) within that. There's an argument for mapping the FCT entry to the highest relevant match in WikiFCD -- nutritional properties should accrue to the more general class, when something narrower is not specified. But there's also an argument for mapping the FCT entry to a species (or even varietal?) when possible -- perhaps the overall genus functions more as a "food category" in WikiFCD than a specific "food item" per se. Guidance/thoughts?
2. Transformed/derived/processed foods
The FCT has many foods that are "transformed" versions of the original crop (e.g., nut oil), and foods that are "derived" from something else (e.g., milk from cows). It seems inappropriate to map these FCT entries to WikiFCD items for the original food item (e.g., nuts and cows respectively). However, finding the appropriately matching food item for these "secondary" food products gets complicated quickly. For example, there are milks with many different nutritional contents (skim vs low-fat vs whole, pasteurized vs raw, ...). Two possible approaches come to mind:

a. Find the most analogous food item in WikiFCD, where the WA FCT entry and the WikiFCD entry seem to be capturing the same thing. If the WA FCT entry says "raw whole cow milk", then don't map it to anything low-fat or pasteurized in WikiFCD. Get as close as possible.

b. Don't even try to find a match, because W.African raw cow milk is inherently distinct from cow milk produced in Europe/USA. We want WikiFCD to capture the richness/diversity of foods produced in different places, so it does a disservice to W.African milk to map it to a QID of *any* other milk product from anywhere.

Thoughts/guidance?
3. Geographical variance
To what degree are WikiFCD items supposed to be geography-specific? If Malawi and Nigeria have two different nutritional values for the ostensibly same type of food, how do we treat this? Several approaches seem possible:

a. Capture both sets of nutritional values as "multiple versions of the truth", allowing multiple values for (e.g.) fibre content for the same QID.

b. Treat geography as part of the food identifier: create separate QIDs for each location-varietal combination, so that (e.g.) "Malawi peanuts" and "Nigerian peanuts" are not treated as the same food, but more like local "versions" of the food, each with its authoritative nutritional values.

Thoughts/guidance?
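To make the difference between the two approaches concrete, here is a rough sketch in plain Python rather than in WikiFCD's actual data model. Everything in it is an assumption for illustration: the `Measurement` class, the function names, the `Q_PEANUT` identifier, and the fibre values are invented placeholders, not data from either FCT.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    nutrient: str
    value: float
    unit: str
    country: str  # where the value was measured/reported


def group_by_item(records):
    """Approach (a): one item per food; each value carries its own country."""
    items = defaultdict(list)
    for qid, measurement in records:
        items[qid].append(measurement)
    return dict(items)


def group_by_item_and_place(records):
    """Approach (b): the country is part of the key, giving one 'item' per
    food-location pair. The (qid, country) tuple stands in for a separate QID."""
    items = defaultdict(list)
    for qid, measurement in records:
        items[(qid, measurement.country)].append(measurement)
    return dict(items)


# Placeholder records: the identifier and numbers are made up.
records = [
    ("Q_PEANUT", Measurement("fibre", 8.0, "g/100g", "Malawi")),
    ("Q_PEANUT", Measurement("fibre", 9.0, "g/100g", "Nigeria")),
]

option_a = group_by_item(records)
option_b = group_by_item_and_place(records)
print(len(option_a), "item(s) under approach (a)")
print(len(option_b), "item(s) under approach (b)")
```

Under (a) a query for "peanut" returns every regional value in one place, and the reader has to choose among them; under (b) each regional entry has a single value, but something else has to record that the entries are the same underlying food.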
Initial draft complete
I think I'm done with this task. WikiFCD QIDs, where they exist, have been added to the WA FCT file. Some notes:
I created a "working version" of the WA FCT file. Here is a link to that file: Working Copy of WA FCT
This file is *exactly* the same as the source file, apart from the entries added in Column H of Tab03 and one small addition: a new tab ("Lookup") that I used for recording WikiFCD QIDs and notes for some entries. This tab doesn't play a functional role in the file, so you can delete it if you need to. I left it in because its notes explain a handful of not-so-straightforward mappings that you might want to read about.
Thank you for your work and for sharing the file! I'll take a look at this today. Kat (talk) 16:58, 5 October 2022 (UTC)