I am working on a project for school and I am having trouble extracting information from my dataset and plotting it. I am using the dataset "movies" and the package "ggplot2movies". My task is to filter six movie genres (Action, Comedy, Romance, Drama, Documentary, and Animation) ranging from the years 1990 to 2005, and plot the number of movies produced by year. Each genre should have "one curve", and all curves must be plotted in the same figure. It must include an appropriate legend and have different colors.
I already have some of the code, but it is not running as expected. I am obviously new to this and could use assistance. The code can be found here:
library(ggplot2movies)
data(movies)
filtered_movies <- movies %>% filter(year >= 1990 & year <= 2005, Action ==1, Comedy == 1, Animation == 1, Drama == 1, Documentary ==1, Romance == 1)
Genresix = tibble(Genre = c("Action", "Animation", "Comedy", "Drama",
"Documentary", "Romance"),
Movies1990 = c(filtered_movies$Action,
filtered_movies$Animation,
filtered_movies$Comedy,
filtered_movies$Drama,
filtered_movies$Documentary,
filtered_movies$Romance
))
ggplot(data = movies, mapping = aes(x = Genre, y = years)) +
geom_line(size = 5)+ geom_line(colour = "red")
plot <- ggplot(filtered_movies, aes(x = year, color = genre)) +
geom_line(stat = "count", aes(group = genre)) +
labs(title = "Number of Movies by Genre (1990-2005)",
x = "Year",
y = "Number of Movies") +
theme_minimal() +
scale_color_discrete(name = "Genre")
# Display the plot
print(plot)
I have an idea of what to do, but I am not sure how to translate my ideas into R. Feel free to peruse the "movies" dataset yourself. The genre "Short" is NOT supposed to be included for the assignment.
I am trying to plot six movie genres (Action, Comedy, Romance, Documentary, Drama, and Animation) from the dataset "movies" using R. The graph must include an appropriate legend and different colors.
There should be one curve plotted for each genre, and all curves must be in the same figure. I tried using ggplot, but something is missing. I expected a plot of each genre by year (1990 to 2005). Instead, running the code produces this error:
Error in `tibble()`:
! Tibble columns must have compatible sizes.
• Size 6: Existing data.
• Size 0: Column `Movies1990`.
ℹ Only values of size one are recycled.
Backtrace:
1. tibble::tibble(...)
I am obviously a novice and could use some help.
Can you rephrase your question's title so it won't be specific to your problem? I can't even tell what language you're using. Also, you wrote most of the question twice.
You would probably be better off aggregating the data first, to end up with a dataset that has years, number of movies and genre. Then you can plot lines where x is years and y is number of movies where the color aesthetic is genre.
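One way to read this suggestion is sketched below. It is only an illustration, not the commenter's own code: the names genres, movies_per_year, flag, and n_movies are made up for the example, and it assumes a movie flagged for several genres should be counted once in each of them.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggplot2movies)

# The six genre flag columns required by the assignment ("Short" is left out).
genres <- c("Action", "Comedy", "Romance", "Drama", "Documentary", "Animation")

# Aggregate first: one row per year and genre with the number of movies.
# Assumption: a movie with several genre flags counts once per genre.
movies_per_year <- movies |>
  filter(year >= 1990, year <= 2005) |>
  pivot_longer(all_of(genres), names_to = "genre", values_to = "flag") |>
  filter(flag == 1) |>
  count(year, genre, name = "n_movies")

# Then plot: x is the year, y is the count, color distinguishes the genres.
p <- ggplot(movies_per_year, aes(x = year, y = n_movies, color = genre)) +
  geom_line() +
  labs(x = "Year", y = "Number of movies", color = "Genre")
print(p)
```

Because the counts are computed before plotting, movies_per_year can also be inspected or printed as a table to check the numbers behind each curve.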
There are multiple issues with your code. First, your filter statement keeps only films with a 1 in each of the six genre columns, and apparently no film meets this condition. That is why filtered_movies is empty and tibble() reports a column of size 0. Second, there are easier ways to stack multiple data frame columns, e.g. you could use tidyr::pivot_longer to reshape your data, then filter for films which belong to one of the six genres.
After doing so your plotting code works fine:
library(ggplot2)
library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2movies)

# Keep 1990-2005, then reshape the six genre flag columns into long format:
# one row per movie and genre, with the 0/1 flag in `value`.
filtered_movies <- movies |>
  filter(year >= 1990 & year <= 2005) |>
  pivot_longer(
    c(Action, Comedy, Animation, Drama, Documentary, Romance),
    names_to = "genre"
  ) |>
  # Keep only the genres each movie actually belongs to.
  filter(value == 1) |>
  select(title, year, genre)

# stat = "count" counts the rows per year within each genre group.
ggplot(filtered_movies, aes(x = year, color = genre)) +
  geom_line(stat = "count", aes(group = genre)) +
  labs(
    title = "Number of Movies by Genre (1990-2005)",
    x = "Year",
    y = "Number of Movies"
  ) +
  theme_minimal() +
  scale_color_discrete(name = "Genre")
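To see why the filter in the question returned nothing, the two readings of the condition can be compared directly. This is a small diagnostic sketch, assuming dplyr 1.0.4 or later for if_all() and if_any(); the names genres, in_range, all_six, and any_of_six are made up for the example.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2movies)

genres <- c("Action", "Comedy", "Romance", "Drama", "Documentary", "Animation")

in_range <- movies |> filter(year >= 1990, year <= 2005)

# AND across all six flags, as in the question:
# a movie must belong to every genre at once.
all_six <- in_range |> filter(if_all(all_of(genres), ~ .x == 1))

# OR across the flags: a movie must belong to at least one of the six genres.
any_of_six <- in_range |> filter(if_any(all_of(genres), ~ .x == 1))

nrow(all_six)     # 0, consistent with "Size 0" in the tibble() error
nrow(any_of_six)  # the movies that can contribute to the plot
```

Separating conditions with commas inside filter() combines them with AND, so the "at least one genre" reading needs either if_any() or the long-format reshape shown above.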