00:00.00 archpodnet Welcome back to the ArchaeoTech Podcast, episode 213. We're talking about an issue of Advances in Archaeological Practice, which you can find a link to in the show notes, so check that out if you haven't done so already. Paul, I think you had some thoughts on what we were talking about at the end of the last segment.
00:19.94 Paul Yeah. So one of the other threads we've been thinking about is organization: how do you organize your data so that you can find it again, and how do you organize it so other people can find it? Having this digital management plan is part of how we explain what we're doing; we have to have it written out for that NSF grant. But we're also exploring, you know, there's data archiving on the one hand, which is what I was talking about with the cloud storage, but then there's also archiving it and making it accessible for more people. So we've been involved with Open Context, discussing with them how we will host our data there. So what we have is this really complex web. We've got file-based data.
01:00.00 archpodnet Um.
01:08.30 Paul We've got our own small databases for recording things: the ceramicist's database, the faunal analyst's database. You know, I make databases because I'm an idiot. I make databases to handle all sorts of data that I deal with, survey data. And these ones aren't...
01:22.10 archpodnet Yeah.
01:27.67 Paul These are just temporary sorts of things. These are conduits, but then they go into a bigger one that's hosted at Penn, that is supposed to be publicly accessible, and that's more of a long-term storage. But then, like I said, with Open Context we have to figure out how to translate our data using their standards so that we can get select pieces of our data hosted by them and accessible to the wider audience. So it's this really complicated, complex
web of different kinds of data: how they get stored, what gets stored where and why, how it gets recorded. And again, back to the first of these two articles.
01:51.59 archpodnet Ah.
02:05.68 Paul Its title is "A Systems-Thinking Model of Data Management and Use in US Archaeology," by Elizabeth Bollwerk, Neha Gupta, and Jolene Smith. And just seeing that, the systems thinking
02:20.88 archpodnet Um.
02:24.49 archpodnet Yeah.
02:24.90 Paul tickled me, because as I briefly mentioned before, before I dropped IT to come back into full-time field archaeology, my job at the school was systems manager. I was the one that knew that if you changed a username format here, it's going to break some other system over there. Yeah, and all these little things that you wouldn't know until you've poked and prodded. So seeing this forefronted, that you have to think of it as a system of interrelated data management techniques,
02:46.23 archpodnet Over.
03:03.20 Paul really is appealing to me. It's just innately appealing because it's what I've had to deal with for a couple of decades. And it isn't an answer. It's not like, oh, all you have to do is sprinkle the systems thinking onto how you're dealing with your data. But if you know that you have to think about it as a system:
03:11.29 archpodnet Um.
03:22.51 Paul What happens when we change these names? If I collect it in this format or save it in that format, who else out there in the world can use it? Do I have to then translate it to another format that somebody else can use? Do I have to add metadata to it? Do I have to add, and here's a term that I hadn't heard before but I love it, paradata?
03:33.31 archpodnet Ah.
03:39.16 archpodnet Okay, yeah.
03:42.63 Paul And that was brought up, and that's basically documentation of the processes of data collection.
So it's not just, you know, a photograph might have metadata that tells you things about the camera, the lens, the exposure, the geolocation, the date, the time, and so on and so forth. Paradata
03:46.11 archpodnet Oh.
04:01.34 Paul would be one step above that, I guess: explaining why you're using that camera, why you were taking photographs, what your decision process was for what needed to be photographed. So it's a higher-level thing, and they suggested
04:07.13 archpodnet Um, yeah.
04:17.63 Paul that that also has to be documented in order to make the data, even with good metadata, more usable. People have to understand that the data don't just happen by magic. You don't just press the record button and now you've got your data. Even if you are just pressing the record button, there have been a lot of decisions that went into that before you hit that record button. So one of the other things I've enjoyed is another site called protocols.io. It's mostly geared towards people in laboratory sciences for publishing their own protocols,
04:39.20 archpodnet Yeah.
04:53.41 Paul [unclear] the name of different lab techniques. But I think I've got a couple of articles up there now for GIS work they've done, and I think that that's something that people ought to look into for documenting how and why we get our data the way that we do. So anyhow, this whole systems thinking approach was really interesting to me. The problem is at once really complex, and once you start to see the threads of how things tie together, it becomes much simpler, if that makes any sense.
05:25.40 archpodnet But that seems to be the hardest thing that these archaeologists and people doing this data collection planning have to do: really looking outside of their project from a data standpoint, right?
I mean, I would say a lot of people try to look at what they're doing in the project in the grander scheme of things from an archaeological standpoint, right? That's kind of like one of those "what does it all mean" things when you're writing up the report. It's like, how does this fit in the context of everything else around here?
05:49.74 Paul One more.
05:56.22 archpodnet But that's where it stops, right? Nobody thinks about it. They use their data to come to that conclusion, but they don't think about how their data fits in the wider context of things and what their data is going to be used for. I almost see it as, you know, academic projects where, especially working with Codifi and then with Wildnote, talking to graduate students, many graduate students we talked to were really proud of the FileMaker database they put together for their project. And I'm like, do you really think you had to invent this for that project? Is that one of the things that you really spent a lot of time doing? I mean, great job. It looks nice.
06:24.28 Paul Easy.
06:34.28 archpodnet But why did you reinvent the wheel, right? Why did you do this? Why didn't you find some other standard that you were going to try to fit your data into, so when you're done with it, it actually goes somewhere, rather than creating this whole database just for you, just for your own thing that nobody else understands, and you organize things in a different way? So it's like we're trained to think of the big picture from an analysis standpoint but not from a longevity-of-the-data standpoint. And man, I always
07:06.67 Paul Right.
07:10.78 archpodnet always think back to that
one example that is just always in my mind from Chaco Canyon, with the charcoal. Some of the first Europeans that saw that were just hanging out in Pueblo Bonito and they said, hey, we're going to spend the night here, we need to make a fire. And so they made a fire where there was already a fire in the room and completely destroyed any possibility of carbon-14 dating that wall by completely contaminating the entire thing. And nobody thought that later on you'd even be able to do anything with that, right? I mean, of course it was the
07:30.26 Paul We.
07:38.37 Paul Yeah, yeah.
07:45.20 archpodnet 1920s or '15s or something like that, whatever it was, maybe even earlier. And so how could they have known? But that's the whole point: how could you have known? It's just that not enough thought is put into that when people are doing these things.
07:52.80 Paul Move.
08:00.20 Paul Yeah, which again is why I'm happy that we've had this moment, this forced break, to think these things through. Because for the data collection, like I said, sometimes it does make sense to have that little bespoke database, but that can't be the end product.
08:07.66 archpodnet Um.
08:18.10 archpodnet Yeah.
08:19.80 Paul You know, the one that makes it very easy for the ceramicist to collect her data quickly and easily, with exactly what she needs for her analyses, is great, but it has to... You know? And so we have, again, two steps up beyond that one: one being the database that's hosted at Penn, and the other one being Open Context. So it's going to flow up through that, but we have to think about how we're going to do that. Otherwise, it's always going to be a challenge, but it becomes a real task to try to figure out how you're going to shoehorn it in. But if you could plan it
08:41.80 archpodnet Ah.
08:56.40 Paul from the start, how it's going to flow, it becomes easier. And that's what I'm saying about the threads in the systems thinking: if you look at how things relate, it can sometimes make things easier, right? Rather than getting... I mean, in one of the other articles, there was a
09:07.65 archpodnet Yeah.
09:15.11 Paul term that got thrown out called "inherited digital messes." Like, oh, I know what that is. Oh yes, I've seen those before. And I guess what we want to do... I mean, you're not going to be able to avoid that, because there are digital messes out there that are
09:20.66 archpodnet Ah, nice. Yes.
09:30.88 Paul not your fault, not anybody else's fault. They just exist because they were done twenty, thirty years ago, and that's the way it is, too bad. But if you have the opportunity to not create new digital messes going forward, that seems to be where we have to be striving to get to.
09:42.80 archpodnet Oh. What do you think is the biggest problem with trying to figure out what to do with the data set before you even collect it, not even possibly knowing much about what you're about to collect, right? Especially from a CRM standpoint, you're just going to an area. Sure, you can do the research and have an idea of what you might find there. But with you having worked in such wildly different areas as compared to where I've worked, I mean Saudi Arabia, other parts of the Middle East, and then various places in the United States and other places in the world: do you think there's a possibility that you could just create a dataset, or create a way to contain all the data from a project, without actually even caring where you're going or what you're doing, and saying, "I can fit whatever I find here into this container," so to speak? Is that even something we should be striving for, or is that too restrictive?
10:47.19 Paul I waffle. I initially wanted to make doing something like that a centerpiece of my dissertation research, and then after a while I decided, nope, can't be done. It's too big of a problem.
11:00.39 archpodnet Ah, yeah.
11:01.68 Paul And this is a long time ago now. I mean, we have better tools now, and it's more possible than by one grad student who is also trying to struggle with his own data and understand things and read things in different languages he doesn't really know, and so on and so forth. Oh, and raise kids at the same time. Sorry.
11:19.19 archpodnet Oh yeah, yeah, no worries. Yeah, that little thing.
11:21.21 Paul It wasn't going to happen. It needs to be tackled by people that really know what they're doing, and certainly data science has undergone a revolution in the last ten, fifteen years. So there are people that are definitely better poised to handle that and answer that in the affirmative than when I looked at it most closely. But where was I going with this? Is it possible? Yes. But the problem, again, I think, is back at the start: if you have all possibilities open,
11:40.56 archpodnet Um.
11:47.77 archpodnet Yeah.
11:58.69 Paul you don't know what to collect and what not to collect, right? You have to be able to put things in a particular bucket, put things in certain fields, so that they can go through. But you can't present every option right at the start.
12:00.76 archpodnet Right.
12:09.13 archpodnet Ah.
12:14.75 archpodnet Right.
12:17.80 Paul Yeah, and if you do... So the school information system that we used when I started in 2000 had been used for a few years, and it was very open-ended and very usable by a bunch of different schools. But a lot of the administrators who were using it didn't really know the ins and outs of database practices.
So one of the first things that the new director of IT did when he came in was he went and started cleaning up the data. One of the things that people had been doing for a number of years was putting asterisks and dollar signs or things on people's names to indicate if they were a new hire, because they didn't seem to know that there was someplace else in a record, the hire date, and that you could use that instead to filter. But, you know, then
12:58.50 archpodnet Okay.
13:12.50 Paul they had to go through Crystal Reports in order to get that out. It was much easier for them just to get a list of the names and then manually pull out everybody that had an asterisk. And so that gets at training, that gets at data use, that gets at the design and implementation of the data entry.
13:17.42 archpodnet Oh.
13:30.34 Paul These are all interrelated. Again, systems thinking: these are all interrelated problems that, if not done right, make it really hard to collect the proper data in the field. So is that higher-level goal attainable?
13:40.26 archpodnet Yeah.
13:47.81 Paul Probably, but it has to be thought through from the ground up.
13:48.61 archpodnet Mm.
13:54.69 archpodnet Right. Okay, well, I have another question for you. We'll do that on the other side of the break, when we wrap up this topic, back in a minute.
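
The cleanup Paul describes from his school IT days can be sketched roughly as follows. This is only an illustration under stated assumptions: the `StaffRecord` type, the `clean_name` and `new_hires` helpers, and the sample names and dates are all hypothetical, not taken from the actual school information system. The idea is that "new hire" status is derived from the hire date field that already exists in the record, rather than encoded as an asterisk or dollar sign in the name.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class StaffRecord:
    """Hypothetical record: the hire date lives in its own field."""
    name: str
    hire_date: date


def clean_name(raw: str) -> str:
    """Strip ad hoc '*' and '$' markers (and stray spaces) from a name."""
    return raw.strip().strip("*$").strip()


def new_hires(records: list[StaffRecord], since: date) -> list[StaffRecord]:
    """Select new hires by filtering on hire_date, not on name markers."""
    return [r for r in records if r.hire_date >= since]


# Made-up sample data, including names that carry the old markers.
records = [
    StaffRecord(clean_name("*Ana Ruiz"), date(2000, 8, 28)),
    StaffRecord(clean_name("Ben Cho$"), date(1996, 9, 3)),
    StaffRecord(clean_name("Dee Park"), date(2000, 9, 5)),
]

recent = new_hires(records, since=date(2000, 7, 1))
print([r.name for r in recent])  # ['Ana Ruiz', 'Dee Park']
```

Keeping the marker out of the name means the same records can be sorted, matched, and exported without anyone having to remember what an asterisk meant, which is the systems-level point of the anecdote.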