This lab journal demonstrates how first names of doctoral recipients
are used to infer gender, using the mock dataset created in Names.
Custom functions
fpackage.check: Check if packages are installed (and install if not) in R (source).
# start from a clean workspace
rm(list = ls())
# install (if needed) and load each package named in `packages`
fpackage.check <- function(packages) {
  lapply(packages, FUN = function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      library(x, character.only = TRUE)
    }
  })
}
Packages
tidyverse: For general data manipulation
rrapply: For detecting empty nested lists
stringr: For string manipulations
dplyr: For data manipulation
packages = c("tidyverse", "rrapply", "stringr", "dplyr")
fpackage.check(packages)
Method 1: Scraping Meertens Voornamenbank
The first step is web scraping the Meertens Voornamenbank. This website
contains a database of the first names of all individuals who have ever
been registered in the “Personal Records Database (BRP)”, together with
the gender indicated on their official government documents.
People who live in the Netherlands for at least 4 months are registered
in the BRP. The Meertens Voornamenbank contains data on names and gender
from 1880 to 2016. We scraped the unique first names in our sample of
PhDs.
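As a minimal sketch of the preparation for this step, the snippet below reduces a vector of first names to unique, lower-case values and builds one lookup URL per name. The helper name `build_nvb_url` and the base URL pattern are assumptions made for illustration; the scraping and parsing code itself is not part of this journal.

```r
# Hypothetical helper: one Voornamenbank lookup URL per unique first name.
# The URL pattern is an assumption for illustration, not a documented API.
build_nvb_url <- function(firstnames,
                          base_url = "https://www.meertens.knaw.nl/nvb/naam/is/") {
  # normalise: trim whitespace, lower-case, drop missing/empty values and duplicates
  cleaned <- tolower(trimws(firstnames))
  cleaned <- unique(cleaned[!is.na(cleaned) & cleaned != ""])
  # percent-encode characters that are not URL-safe (e.g. spaces or accents)
  encoded <- vapply(cleaned, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  data.frame(firstname = cleaned,
             url = sprintf("%s%s", base_url, encoded),
             stringsAsFactors = FALSE)
}

nvb_urls <- build_nvb_url(c("Jan-Willem", "corine", "jan-willem", NA, " Lena "))
nvb_urls
```

Deduplicating before building the URLs means each name is requested only once, however often it occurs in the sample.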
For the first PhD in our data, we would scrape the Voornamenbank
page for ‘Jan-Willem’.
phdethnicity[1, c(1, 2)]
#> id firstname
#> 1 a jan-willem
This would yield the following data:
In this image, the ‘m’ and ‘v’ sections represent frequencies of the
name among men and women respectively. The first number for each gender
represents the frequency of the name as primary first name, while the
second number counts how often the name occurs as Christian name. We
only extracted the numbers for the primary first name, because Christian
names can be more ambiguous in terms of gender.
[Figure: Meertens Voornamenbank result for ‘Jan-Willem’]
Doing this for all names in our sample yields the dataframe below.
print(gender, row.names = FALSE)
#> firstname freq_m freq_f
#> jan-willem 1946 0
#> corine 0 2316
#> jan 186731 15
#> monique 0 39481
#> selim 348 0
#> nahid 5 57
#> jacques 3261 5
#> lena 5 9694
Next, we apply a simple majority rule to obtain gender.
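gender <- gender %>%
  mutate(
    # "--" marks a missing count; treat it as zero and make sure counts are numeric
    freq_m = as.numeric(ifelse(freq_m == "--", 0, freq_m)),
    freq_f = as.numeric(ifelse(freq_f == "--", 0, freq_f)),
    # simple majority rule; ties stay undetermined (NA)
    female = ifelse(freq_f > freq_m, 1, 0),
    female = ifelse(freq_m == freq_f, NA, female)
  )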
# renaming the variable and transforming into a factor
gender$female <- as.factor(gender$female)
levels(gender$female)
#> [1] "0" "1"
levels(gender$female) <- c("men", "women")
gender <- gender %>%
dplyr::rename(gender = female)
# dropping the frequency variables
gender <- subset(gender, select = c(firstname, gender))
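The majority rule can also be sketched as a small stand-alone function, which makes two edge cases explicit. Counts scraped as text must be converted to numbers before comparing them (as strings, "5" > "3261" is TRUE), and ties are left undetermined. The function name `majority_gender` and the tie row in the toy data are invented for illustration.

```r
# Sketch of the majority rule on toy counts (the last row is invented to show a tie).
majority_gender <- function(freq_m, freq_f) {
  # "--" (or NA) marks a missing count; treat it as zero and convert to numeric
  to_count <- function(x) {
    x <- as.character(x)
    x[is.na(x) | x == "--"] <- "0"
    as.numeric(x)
  }
  m <- to_count(freq_m)
  f <- to_count(freq_f)
  # the larger count wins; ties (including 0 vs 0) stay NA
  out <- ifelse(f > m, "women", ifelse(m > f, "men", NA_character_))
  factor(out, levels = c("men", "women"))
}

toy <- data.frame(
  firstname = c("jan", "nahid", "jacques", "example-tie"),
  freq_m    = c("186731", "5", "3261", "--"),
  freq_f    = c("15", "57", "5", "--"),
  stringsAsFactors = FALSE
)
toy$gender <- majority_gender(toy$freq_m, toy$freq_f)
toy
```

Fixing the factor levels explicitly keeps the labels correct even when a sample happens to contain names of only one gender.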
Now, we are left with a dataframe that contains first names and
associated genders.
# we add the new variable to our dataframe
phdgender <- left_join(phdethnicity, gender, by = "firstname")
phdgender[, c(2, 11)]
#> firstname gender
#> 1 jan-willem men
#> 2 corine women
#> 3 jan men
#> 4 monique women
#> 5 selim men
#> 6 nahid women
#> 7 jacques men
#> 8 lena women
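In the example data every name is matched, but in a larger sample some names will have no Voornamenbank entry or a tied count. A minimal sketch of how such cases surface after the join, using toy data in which "unknownname" is a hypothetical name without an entry:

```r
library(dplyr)

# Toy data, invented for illustration.
phd_toy <- data.frame(id = c("a", "b", "c"),
                      firstname = c("jan", "lena", "unknownname"),
                      stringsAsFactors = FALSE)
gender_toy <- data.frame(firstname = c("jan", "lena"),
                         gender = factor(c("men", "women"), levels = c("men", "women")),
                         stringsAsFactors = FALSE)

# left_join keeps every PhD; names without a match get gender = NA
joined <- left_join(phd_toy, gender_toy, by = "firstname")

# rows that still lack a gender and need a second source
unmatched <- filter(joined, is.na(gender))
unmatched$firstname
```

These remaining NA values are the cases a second source has to fill.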
Method 2: GenderizeR
Next, we used the genderizeR package (https://github.com/kalimu/genderizeR) to
supplement the gender information from the Meertens Voornamenbank.
GenderizeR compiles information on names from different sources
(e.g. social media profiles), and the data are country-coded. To get the
most accurate gender information for our first names, we scraped not
only the global database but also the country-specific databases for the
Netherlands, Turkey and Morocco.
To match first names and genders with country-specific information, we first
add the ethnicity information gathered in ethnicity.html.
Then, we combine the different genderizeR databases into a single
dataframe, so that gender information can be extracted efficiently
depending on ethnicity.
Finally, we add the gender data from the specific databases to our PhD
dataframe. In doing so, we give priority to the Meertens Voornamenbank,
because this database is more open about its sources and we therefore
expect the quality of its gender data to be higher.
# combining the different genderizeR databases into a single dataframe
colnames(genderizer_all) <- c("firstname", "count_all", "gender_all", "probability_all")
colnames(genderizer_tr) <- c("firstname", "count_tr", "gender_tr", "probability_tr")
colnames(genderizer_ma) <- c("firstname", "count_ma", "gender_ma", "probability_ma")
colnames(genderizer_nl) <- c("firstname", "count_nl", "gender_nl", "probability_nl")
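# dropping the firstname column of the country-specific data to avoid duplicate columns;
# cbind assumes all four data frames list the same names in the same order
genderizer <- cbind.data.frame(genderizer_all, genderizer_nl[, -1], genderizer_tr[, -1],
  genderizer_ma[, -1])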
genderizer$gender_all <- dplyr::recode(genderizer$gender_all, male = "men", female = "women")
genderizer$gender_nl <- dplyr::recode(genderizer$gender_nl, male = "men", female = "women")
genderizer$gender_ma <- dplyr::recode(genderizer$gender_ma, male = "men", female = "women")
genderizer$gender_tr <- dplyr::recode(genderizer$gender_tr, male = "men", female = "women")
# adding genderizeR data to the example PhD dataframe
phdgender <- cbind.data.frame(phdgender, genderizer[, -1])
# we take country-specific genderizeR data where applicable, else the complete genderizeR database
phdgender <- phdgender %>%
  mutate(
    genderZ = gender_all,
    genderZ = ifelse(ethnicity == "dutch", gender_nl, genderZ),
    genderZ = ifelse(ethnicity == "moroccan", gender_ma, genderZ),
    genderZ = ifelse(ethnicity == "turkish", gender_tr, genderZ)
  )
phdgender$genderZ <- factor(phdgender$genderZ, levels = levels(phdgender$gender))
# Only use genderizeR when gender is not present from Meertens.
# Indexed assignment keeps the factor labels intact; ifelse() on factors returns
# integer codes, which are mislabelled when relabelled if only one gender occurs.
missing_meertens <- is.na(phdgender$gender)
phdgender$gender[missing_meertens] <- phdgender$genderZ[missing_meertens]
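The selection logic described above can be sketched on toy data: take the country-specific genderizeR result where the ethnicity calls for it, otherwise the global one, and let Meertens override both whenever it gave a decision. All values below are invented, and `pick_genderizer` is a hypothetical helper name. Unlike the ifelse() chain, this sketch falls back to the global result when ethnicity is missing.

```r
# Toy inputs (all values invented for illustration).
toy <- data.frame(
  firstname  = c("jan", "selim", "nahid", "lena"),
  ethnicity  = c("dutch", "turkish", "other", "other"),
  gender     = c("men", NA, NA, "women"),        # Meertens result; NA = no decision
  gender_all = c("men", "men", "women", "men"),  # global genderizeR result
  gender_nl  = c("men", NA, NA, "women"),
  gender_tr  = c(NA, "men", NA, NA),
  gender_ma  = rep(NA_character_, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: country-specific result where applicable, else the global one.
pick_genderizer <- function(ethnicity, all, nl, ma, tr) {
  out <- all
  is_nl <- !is.na(ethnicity) & ethnicity == "dutch"
  is_ma <- !is.na(ethnicity) & ethnicity == "moroccan"
  is_tr <- !is.na(ethnicity) & ethnicity == "turkish"
  out[is_nl] <- nl[is_nl]
  out[is_ma] <- ma[is_ma]
  out[is_tr] <- tr[is_tr]
  out
}

toy$genderZ <- with(toy, pick_genderizer(ethnicity, gender_all, gender_nl, gender_ma, gender_tr))

# Meertens has priority; genderizeR only fills the gaps.
toy$gender_final <- ifelse(is.na(toy$gender), toy$genderZ, toy$gender)
toy$gender_final <- factor(toy$gender_final, levels = c("men", "women"))
toy[, c("firstname", "gender", "genderZ", "gender_final")]
```

In the last toy row the two sources disagree, and the Meertens value is the one that is kept.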
Output
phdgender[, c(2, 5, 11)]
#> firstname lastname_full gender
#> 1 jan-willem verschuuren men
#> 2 corine janssen women
#> 3 jan de vries men
#> 4 monique van vliet women
#> 5 selim aydin men
#> 6 nahid karimi women
#> 7 jacques bernard men
#> 8 lena schneider women
Saving
phdgender <- subset(phdgender, select = c(id, firstname, lastname_full, diss_birthplace, uni, phd_year,
ethnicity, ethnicity2, gender))
save(phdgender, file = "data/processed/phdgender.rda")
Copyright © 2023