I've been teaching myself some R by playing around with different sports statistics, but I've hit a wall. My data looks like this:
match_id player_name player_team points
Match1 Player 1 Team 1 20
Match1 Player 2 Team 1 23
Match1 Player 3 Team 1 24
Match1 Player 4 Team 2 26
Match1 Player 5 Team 2 21
Match1 Player 6 Team 2 22
Match1 Player 7 Team 2 43
Match1 Player 8 Team 2 38
Match2 Player 9 Team 3 24
Match2 Player 10 Team 3 29
Match2 Player 11 Team 3 23
Match2 Player 12 Team 3 22
Match2 Player 13 Team 4 20
Match2 Player 14 Team 4 32
Match3 Player 15 Team 5 24
Match3 Player 16 Team 5 27
Match3 Player 17 Team 5 23
Match3 Player 18 Team 5 20
Match3 Player 19 Team 5 23
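The data covers a whole season, so teams and players repeat over time. Using the data above, I'm trying to find every combination of 3 different players from the same team who each scored 20 or more points in a match (the data has already been filtered to scores of 20 or more), and then count how many matches each combination appears in. That would tell me which groups of 3 players on the same team often score 20+ when they play together.
Because some players on different teams share the same name, I used mutate to combine player_team with player_name, and player_team with match_id, since some of my earlier attempts ended up grouping players from different teams.
The closest I've got is the code below, but it only works for combinations of 2.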
library(dplyr)
library(igraph)

# Keep only the 20+ rows, then prefix both keys with the team so that
# same-named players on different teams stay distinct
data <- players %>%
  filter(disposals >= 20) %>%
  mutate(
    match_id = paste(player_team, match_id, sep = "_"),
    player_name = paste(player_team, player_name, sep = "_")
  ) %>%
  select(match_id, player_name)

# crossprod() of the match-by-player table counts, for every pair of
# players, the number of matches in which both appear
dataout <- get.data.frame(
  graph_from_adjacency_matrix(
    crossprod(table(data)),
    mode = "directed",
    weighted = TRUE,
    diag = FALSE
  )
)
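This gives me the information below (the weights are based on occurrences across the whole data set, not the example above; each team has played 3 matches so far).
Note that combinations are not repeated in every possible order (i.e. Team1_Player1 Team1_Player2 is recognised as the same as Team1_Player2 Team1_Player1).
Is there another solution that would let me include three (or more) players rather than just two?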
You can use the function combn() with m = 3 to get all possible triples:
library(tidyverse)
data <- tribble(
~match_id, ~player_name, ~team_name, ~points,
"Match1", 1L, 1L, 20L,
"Match1", 2L, 1L, 23L,
"Match1", 3L, 1L, 24L,
"Match1", 4L, 2L, 26L,
"Match1", 5L, 2L, 21L,
"Match1", 6L, 2L, 22L,
"Match1", 7L, 2L, 43L,
"Match1", 8L, 2L, 38L,
"Match2", 9L, 3L, 24L,
"Match2", 10L, 3L, 29L,
"Match2", 11L, 3L, 23L,
"Match2", 12L, 3L, 22L,
"Match2", 13L, 4L, 20L,
"Match2", 14L, 4L, 32L,
"Match3", 15L, 5L, 24L,
"Match3", 16L, 5L, 27L,
"Match3", 17L, 5L, 23L,
"Match3", 18L, 5L, 20L,
"Match3", 19L, 5L, 23L
)
combinations_data <-
  data %>%
  filter(points >= 20) %>%
  # One row per team and match; the other columns go into a list-column named `data`
  nest(data = -c(team_name, match_id)) %>%
  mutate(
    # combn() errors when a team has fewer than 3 players in a match,
    # so possibly() returns NA for those rows instead
    combinations = data %>% map(possibly(~ {
      .x$player_name %>% unique() %>% combn(3)
    }, NA))
  )
combinations_data %>%
  filter(match_id == "Match1" & team_name == 2) %>%
  pull(combinations) %>%
  first()
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    4    4    4    4    4    4    5    5    5     6
#> [2,]    5    5    5    6    6    7    6    6    7     7
#> [3,]    6    7    8    7    8    8    7    8    8     8
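Created on 2022-04-08 by the reprex package (v2.0.0)
In Match1, Team 2 has 10 unique three-player combinations in which every player scored at least 20 points. Each column of the matrix is one combination.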
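The question also asks how many matches each triple appears in. One possible way to get there, as a rough sketch rather than the only approach, is to turn each triple into a single sorted label and count the labels per team. The toy data, the `scores` and `triple_counts` objects, the `triple` and `n_matches` columns, and the " | " separator below are all made up for illustration; the sketch also assumes the data has one row per player per match.

```r
library(tidyverse)

# Hypothetical toy data: team 1 plays two matches, team 2 has a single 20+ scorer
scores <- tribble(
  ~match_id, ~player_name, ~team_name, ~points,
  "Match1",  "A",          1L,         20L,
  "Match1",  "B",          1L,         23L,
  "Match1",  "C",          1L,         24L,
  "Match1",  "D",          1L,         31L,
  "Match2",  "A",          1L,         25L,
  "Match2",  "B",          1L,         22L,
  "Match2",  "C",          1L,         21L,
  "Match2",  "E",          2L,         30L
)

triple_counts <- scores %>%
  filter(points >= 20) %>%
  distinct(match_id, team_name, player_name) %>%
  group_by(match_id, team_name) %>%
  # combn() needs at least 3 players, so drop smaller groups first
  filter(n() >= 3) %>%
  summarise(
    # Sorting before combn() gives the same label regardless of row order,
    # so A-B-C and C-B-A count as one combination
    triple = list(combn(sort(as.character(player_name)), 3,
                        FUN = paste, collapse = " | ")),
    .groups = "drop"
  ) %>%
  unnest(triple) %>%
  # Number of matches in which each team-specific triple occurred
  count(team_name, triple, name = "n_matches", sort = TRUE)

triple_counts
```

With this toy data, the triple "A | B | C" should come out on top with n_matches equal to 2, and the three other triples from Match1 appear once each. Counting by team_name as well as by the label keeps same-named players on different teams apart, and changing the 3 to another size should extend the idea to larger groups.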